Effective Enrichment of Gene Expression Data Sets

Sirin U., Erdogdu U., Tan M., Polat F., Alhajj R.

11th IEEE International Conference on Machine Learning and Applications (ICMLA), Florida, United States Of America, 12 - 15 December 2012, pp.76-81

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/icmla.2012.22
  • City: Florida
  • Country: United States Of America
  • Page Numbers: pp.76-81
  • Keywords: gene expression data, sample generation, multiple perspectives, learning, gene regulation modeling, probabilistic boolean networks, ordinary differential equations, regulatory networks, microarray, identification, model
  • Middle East Technical University Affiliated: Yes


The growing need for gene-expression data analysis motivates studies in sample generation, because available gene-expression data are scarce: a typical dataset covers thousands of genes but only tens, or rarely hundreds, of samples. In this paper, we formulate the sample generation task in three steps: first, building alternative Gene Regulatory Network (GRN) models; second, sampling data from each of them; and third, filtering the generated samples using metrics that measure compatibility, diversity and coverage with respect to the original dataset. We construct two alternative GRN models using Probabilistic Boolean Networks and Ordinary Differential Equations. We develop a multi-objective filtering mechanism based on the three metrics to assess the quality of the newly generated data. We present a number of experiments to show the effectiveness and applicability of the proposed multi-model framework.
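The generate-then-filter idea described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three-gene network `PBN`, its update rules and probabilities, the Hamming-distance compatibility test, the novelty check used as a stand-in for diversity, and the state-space fraction used as a stand-in for coverage. The paper's own metric definitions and its Ordinary Differential Equation model are not reproduced here.

```python
import random

# Hypothetical 3-gene probabilistic Boolean network (illustrative only):
# each gene has candidate update rules paired with selection probabilities.
PBN = {
    0: [(lambda s: s[1] and s[2], 0.7), (lambda s: s[1], 0.3)],
    1: [(lambda s: not s[0], 1.0)],
    2: [(lambda s: s[0] or s[1], 0.6), (lambda s: s[2], 0.4)],
}

def pbn_step(state, rng):
    """Synchronously update every gene, drawing one rule per gene."""
    nxt = []
    for gene in sorted(PBN):
        rules, weights = zip(*PBN[gene])
        rule = rng.choices(rules, weights=weights, k=1)[0]
        nxt.append(int(bool(rule(state))))
    return tuple(nxt)

def generate_samples(start, n, rng):
    """Run the network for n steps and record each visited state."""
    samples, state = [], tuple(start)
    for _ in range(n):
        state = pbn_step(state, rng)
        samples.append(state)
    return samples

def hamming(a, b):
    """Number of genes whose binary expression differs between a and b."""
    return sum(x != y for x, y in zip(a, b))

def filter_samples(candidates, original, max_dist=1):
    """Keep candidates that are compatible (within max_dist of some
    original sample) and novel (not in the original or accepted set).
    Both criteria are assumed stand-ins, not the paper's metrics."""
    accepted = []
    for c in candidates:
        compatible = min(hamming(c, o) for o in original) <= max_dist
        novel = c not in original and c not in accepted
        if compatible and novel:
            accepted.append(c)
    return accepted

def coverage(samples, n_genes):
    """Assumed coverage proxy: fraction of the 2**n_genes Boolean
    states that appear at least once in samples."""
    return len(set(samples)) / 2 ** n_genes

rng = random.Random(0)                    # fixed seed for repeatability
original = [(1, 0, 1), (0, 1, 1)]         # toy "measured" samples
candidates = generate_samples((1, 0, 1), 50, rng)
enriched = original + filter_samples(candidates, original)
```

In a multi-model setting, a second generator (for example an ODE simulation whose output is discretized) could feed the same `filter_samples` step, so that samples from both models are judged against the original dataset by the same criteria.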