Batch Mode TD(lambda) for Controlling Partially Observable Gene Regulatory Networks


Sirin U., Polat F., Alhajj R.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, vol.14, no.6, pp.1214-1227, 2017 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 14 Issue: 6
  • Publication Date: 2017
  • DOI: 10.1109/tcbb.2016.2595577
  • Journal Name: IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.1214-1227
  • Keywords: Batch mode reinforcement learning, temporal-difference learning, gene regulatory networks, gene expression, gene regulation, PROBABILISTIC BOOLEAN NETWORKS, EXTERNAL CONTROL
  • Middle East Technical University Affiliated: Yes

Abstract

External control of gene regulatory networks (GRNs) has received much attention in recent years. The aim is to find a series of actions to apply to a gene regulation system so that it avoids its diseased states. In this work, we propose a novel method for controlling partially observable GRNs that combines batch mode reinforcement learning (Batch RL) with the TD(lambda) algorithm. Unlike existing studies, which infer a computational model from gene expression data and then derive a control policy over the constructed model, our idea is to interpret the time series gene expression data as a sequence of observations produced by the system, and to obtain an approximate stochastic policy directly from the gene expression data without estimating the internal states of the partially observable environment. We thereby eliminate the most time-consuming phases of existing approaches: inferring a model and running that model for control. Results show that our method can provide control solutions for regulation systems of several thousand genes in only seconds, whereas existing studies cannot solve control problems involving even a few dozen genes. Results also show that our approximate stochastic policies are almost as good as the policies generated by existing studies.
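The computational core named in the abstract is batch-mode TD(lambda): a fixed batch of observation trajectories is swept over repeatedly, with eligibility traces spreading each temporal-difference error backward along a trajectory. The sketch below illustrates only that generic component in tabular form, not the paper's actual policy-extraction procedure; the function name, the (observation, reward) trajectory encoding, and all hyperparameter values are illustrative assumptions, not taken from the paper.

    import numpy as np

    def batch_td_lambda(episodes, n_obs, alpha=0.1, gamma=0.95, lam=0.8, sweeps=50):
        # Tabular batch-mode TD(lambda) value estimation over observation indices.
        # episodes: list of trajectories, each a list of (obs, reward) pairs, where
        # obs is an integer observation index in [0, n_obs). This encoding is a
        # hypothetical stand-in for time series gene expression observations.
        V = np.zeros(n_obs)
        for _ in range(sweeps):                # repeated sweeps over the fixed batch
            for episode in episodes:
                e = np.zeros(n_obs)            # eligibility traces, reset per episode
                for t in range(len(episode) - 1):
                    obs, _ = episode[t]
                    next_obs, reward = episode[t + 1]
                    delta = reward + gamma * V[next_obs] - V[obs]  # TD error
                    e[obs] += 1.0              # accumulating trace for current observation
                    V += alpha * delta * e     # propagate the error along the trace
                    e *= gamma * lam           # decay all traces
        return V

    # Toy usage: two short observation trajectories over 3 observation indices,
    # with reward 1.0 on reaching the desirable observation 2.
    episodes = [[(0, 0.0), (1, 0.0), (2, 1.0)],
                [(0, 0.0), (2, 1.0)]]
    print(batch_td_lambda(episodes, n_obs=3))

Note that the sketch operates on raw observations rather than on estimated hidden states; that is what lets a procedure of this kind skip model inference entirely, which is the source of the speed advantage the abstract reports.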