Efficient partially observable Markov decision process based formulation of gene regulatory network control problem

UTKU ERDOĞDU

Thesis Type: Doctorate

Institution Where the Thesis Was Conducted: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey

Thesis Approval Date: 2012

Student: UTKU ERDOĞDU

Advisor: FARUK POLAT

Open Archive Collection: AVESİS Open Access Collection

Abstract:

The need to analyze and closely study gene-related mechanisms has motivated research on the modeling and control of gene regulatory networks (GRNs). Different approaches exist to model GRNs; they are mostly simulated as mathematical models that represent relationships between genes. Although it makes the problem more challenging, we argue that partial observability is a more natural and realistic setting for the control of GRNs. Partial observability is a fundamental aspect of the problem, yet it is mostly ignored and replaced by the assumption that the states of the GRN are known precisely, that is, full observability. Moreover, existing work that addresses partial observability focuses on formulating algorithms for the finite-horizon GRN control problem. In this work we therefore explore the feasibility of posing the problem in a partially observable setting, mainly with Partially Observable Markov Decision Processes (POMDPs). We propose a POMDP formulation for the infinite-horizon version of the problem. Because POMDP problems suffer from the curse of dimensionality, we also propose a POMDP solution method that automatically decomposes the problem by isolating its unrelated parts and then solves the reduced subproblems. Finally, we propose a method to enrich the gene expression data sets given as input to the POMDP control task, because available data sets contain thousands of genes but only tens, or rarely hundreds, of samples. The method generates more than one model from the available data sets, samples data from each model, and finally filters the generated samples using metrics that measure the compatibility, diversity, and coverage of the newly generated samples.
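
To illustrate what partial observability means for a controller, the following minimal sketch shows the standard POMDP belief update (a Bayes filter) on a toy one-gene model. It is not taken from the thesis: the function name `belief_update`, the two-state model, the two actions, and all probabilities are hypothetical and chosen only for illustration.

```python
def belief_update(belief, T, O, action, observation):
    """One Bayes-filter step for a discrete POMDP.

    belief[s]      : current probability of hidden state s
    T[a][s][s2]    : probability of moving from s to s2 under action a
    O[a][s2][o]    : probability of observing o after reaching s2 under action a
    Returns the new belief b'(s2), proportional to
    O[a][s2][o] * sum_s T[a][s][s2] * b(s).
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Hypothetical toy model (an assumption, not data from the thesis):
# one gene with hidden states 0 (off) and 1 (on).
T = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0: no intervention
    [[0.3, 0.7], [0.1, 0.9]],  # action 1: intervention that favors "on"
]
O = [
    [[0.8, 0.2], [0.3, 0.7]],  # noisy expression reading, same for both actions
    [[0.8, 0.2], [0.3, 0.7]],
]

b0 = [0.5, 0.5]  # the controller starts with no knowledge of the gene state
b1 = belief_update(b0, T, O, action=1, observation=1)
```

The controller never sees the gene state directly; it only maintains a probability distribution over states and revises it after each action and noisy observation. A POMDP policy maps such beliefs to actions, and because the belief has one dimension per hidden state, its size grows quickly with the number of genes, which is the curse of dimensionality mentioned in the abstract.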