Mixed effects models for time series gene expression data


Thesis Type: Doctorate

Institution: Middle East Technical University, Faculty of Arts and Sciences, Department of Statistics, Turkey

Thesis Approval Date: 2011

Student: İBRAHİM ERKAN

Co-Supervisors: ÖZLEM İLK DAĞ, İNCİ BATMAZ

Abstract:

Experimental factors such as cell type and treatment may affect the expression levels of individual genes, which are measured quantitatively by microarrays, in different ways. The measurements may be collected at a few unevenly spaced time points with replicates. The aim of this study is to take cell type, treatment and short time series attributes into account and to draw inferences about their effects on individual genes. A linear mixed effects (LME) model was proposed for the gene expression data, and its performance was validated in a simulation study. Realistic data sets were generated that preserve the structure of the real-life sample data studied by Nymark et al. (2007). The predictive performance of the model was evaluated with measures such as accuracy, sensitivity and specificity, and was compared with that of the competing method Limma (Smyth, 2004). The two methods were also compared on real-life data. The simulation results showed that the predictive performance of LME is as high as 99% and that its False Discovery Rate (FDR) is as low as 0.4%, whereas Limma has an FDR of at least 32%. Moreover, LME has almost 99% predictive capability on the continuous time parameter, whereas Limma reaches only about 67% and cannot even handle continuous independent variables.
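The performance measures named in the abstract (accuracy, sensitivity, specificity and FDR) can be illustrated with a minimal Python sketch. It assumes that, in a simulation, each gene has a known true status (affected by a factor or not) and a call made by the method under evaluation. The names `ConfusionCounts`, `count_outcomes` and `performance_measures` and the toy labels are hypothetical and are not taken from the thesis; the formulas are the standard confusion-matrix definitions.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class ConfusionCounts:
    """Counts of gene-level outcomes (hypothetical helper, not from the thesis)."""
    tp: int  # truly affected genes that were called significant
    fp: int  # unaffected genes that were called significant
    tn: int  # unaffected genes that were not called
    fn: int  # truly affected genes that were missed


def count_outcomes(truth: Sequence[bool], called: Sequence[bool]) -> ConfusionCounts:
    """Tally the four outcomes from simulated truth and method calls."""
    if len(truth) != len(called):
        raise ValueError("truth and called must have the same length")
    tp = sum(1 for t, c in zip(truth, called) if t and c)
    fp = sum(1 for t, c in zip(truth, called) if not t and c)
    tn = sum(1 for t, c in zip(truth, called) if not t and not c)
    fn = sum(1 for t, c in zip(truth, called) if t and not c)
    return ConfusionCounts(tp, fp, tn, fn)


def performance_measures(c: ConfusionCounts) -> Dict[str, float]:
    """Standard definitions; ratios with an empty denominator become NaN,
    except FDR, which is conventionally 0 when nothing is called."""
    nan = float("nan")
    total = c.tp + c.fp + c.tn + c.fn
    n_called = c.tp + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total if total else nan,
        "sensitivity": c.tp / (c.tp + c.fn) if (c.tp + c.fn) else nan,
        "specificity": c.tn / (c.tn + c.fp) if (c.tn + c.fp) else nan,
        "fdr": c.fp / n_called if n_called else 0.0,
    }


# Toy example with made-up labels for eight genes (illustration only).
truth = [True, True, True, False, False, False, False, False]
called = [True, True, False, True, False, False, False, False]
counts = count_outcomes(truth, called)
measures = performance_measures(counts)
```

For these toy labels the counts are 2 true positives, 1 false positive, 4 true negatives and 1 false negative, giving an accuracy of 0.75, a sensitivity of 2/3, a specificity of 0.8 and an FDR of 1/3. These numbers only demonstrate the definitions and have no connection to the results reported in the abstract.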