Batch Mode TD(lambda) for Controlling Partially Observable Gene Regulatory Networks


Sirin U., Polat F., Alhajj R.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, vol.14, no.6, pp.1214-1227, 2017 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 14 Issue: 6
  • Publication Date: 2017
  • DOI: 10.1109/TCBB.2016.2595577
  • Journal Name: IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.1214-1227
  • Keywords: Batch mode reinforcement learning, temporal-difference learning, gene regulatory networks, gene expression, gene regulation, probabilistic Boolean networks, external control
  • Middle East Technical University Affiliated: Yes

Abstract

External control of gene regulatory networks (GRNs) has received much attention in recent years. The aim is to find a series of actions to apply to a gene regulation system so that it avoids its diseased states. In this work, we propose a novel method for controlling partially observable GRNs that combines batch mode reinforcement learning (Batch RL) and TD(lambda) algorithms. Unlike existing studies, which infer a computational model from gene expression data and then derive a control policy over the constructed model, our idea is to interpret the time series gene expression data as a sequence of observations produced by the system, and to obtain an approximate stochastic policy directly from the gene expression data without estimating the internal states of the partially observable environment. We thereby eliminate the most time-consuming phases of the existing studies: inferring a model and running that model for control. Results show that our method can provide control solutions for regulation systems of several thousand genes in only seconds, whereas existing studies cannot solve control problems with even a few dozen genes. Results also show that our approximate stochastic policies are almost as good as the policies generated by the existing studies.
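The abstract describes the approach only at a high level. The Python sketch below is one plausible realization, not the paper's exact formulation: it assumes binarized gene-expression profiles as observations, gene-flip interventions as actions, and a Sarsa(lambda)-style batch update over observation-action pairs with eligibility traces. All names here (batch_td_lambda, stochastic_policy, the toy episode) are hypothetical illustrations.

    import numpy as np
    from collections import defaultdict

    def batch_td_lambda(episodes, gamma=0.9, lam=0.8, alpha=0.05, sweeps=30):
        """Batch-mode TD(lambda) over a fixed set of recorded episodes.

        Each episode is a list of (obs, action, reward) tuples, where an
        observation is a hashable gene-expression pattern (e.g., a tuple
        of binarized expression values) and an action is an intervention
        such as flipping one gene. Q-values are learned directly per
        (obs, action) pair, with no estimation of hidden internal states.
        """
        q = defaultdict(float)
        for _ in range(sweeps):                    # replay the fixed batch
            for episode in episodes:
                traces = defaultdict(float)        # eligibility traces
                for t in range(len(episode) - 1):
                    obs, act, rew = episode[t]
                    next_obs, next_act, _ = episode[t + 1]
                    # TD error of the recorded transition
                    delta = rew + gamma * q[(next_obs, next_act)] - q[(obs, act)]
                    traces[(obs, act)] += 1.0      # accumulating trace
                    for key, e in list(traces.items()):
                        q[key] += alpha * delta * e
                        traces[key] = e * gamma * lam
        return q

    def stochastic_policy(q, obs, actions, temperature=0.5):
        """Softmax over learned Q-values: an approximate stochastic policy."""
        prefs = np.array([q[(obs, a)] for a in actions]) / temperature
        prefs -= prefs.max()                       # numerical stability
        probs = np.exp(prefs)
        return probs / probs.sum()

A toy usage, with rewards of -1 marking diseased states (the final tuple's reward only supplies the bootstrap target):

    # One episode over a 3-gene network: obs = binarized profile,
    # action = index of the perturbed gene.
    episode = [((1, 0, 1), 2, -1.0), ((1, 0, 0), 0, -1.0), ((0, 0, 0), 0, 0.0)]
    q = batch_td_lambda([episode])
    print(stochastic_policy(q, (1, 0, 1), actions=[0, 1, 2]))

The softmax extraction is one common way to turn value estimates into the kind of approximate stochastic policy the abstract refers to; the temperature parameter trades off greediness against smoothness of the resulting action distribution.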