Automated biological data acquisition and integration using machine learning techniques


Thesis Type: Doctorate

Institution Where the Thesis Was Conducted: Orta Doğu Teknik Üniversitesi (Middle East Technical University), Faculty of Engineering, Department of Computer Engineering, Turkey

Thesis Approval Date: 2009

Student: LEVENT ÇARKACIOĞLU

Advisor: MEHMET VOLKAN ATALAY

Abstract:

Since the initial genome sequencing projects, recent advances in technology, molecular biology, and large-scale transcriptome analysis have resulted in data accumulation at a large scale. These data are provided on different platforms and come from different laboratories; therefore, there is a need for compilation and comprehensive analysis. In this thesis, we addressed the automation of biological data acquisition and integration from these non-uniform data using machine learning techniques. We focused on two different mining studies. In the first study, we worked on characterizing the expression patterns of housekeeping genes and described methodologies to compare measures of housekeeping genes with those of non-housekeeping genes. In the second study, we proposed a novel framework, bi-k-bi clustering, for finding association rules of gene pairs; the framework can easily operate on large-scale and multiple heterogeneous data sets. Results in both studies were consistent with, and related to, the available literature. Furthermore, our results provided some novel insights that await experimental testing by biologists.