A Review on Data Mining and Continuous Optimization Applications in Computational Biology and Medicine

Weber G., Ozogur-Akyuz S., Kropat E.

BIRTH DEFECTS RESEARCH PART C-EMBRYO TODAY-REVIEWS, vol.87, no.2, pp.165-181, 2009 (SCI-Expanded)

  • Publication Type: Article / Review
  • Volume: 87 Issue: 2
  • Publication Date: 2009
  • Doi Number: 10.1002/bdrc.20151
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Agricultural & Environmental Science Database
  • Page Numbers: pp.165-181
  • Keywords: gene-environment networks, computational biology, diseases, birth defects, classification, support vector machines, machine learning, model selection, generalized semi-infinite programming, errors, uncertainty, infinity, modeling, dynamical system, intervals, matrix, structural stability, conic programming, continuous, discrete, hybrid, medicine, health care, GENERALIZED SEMIINFINITE OPTIMIZATION, GENETIC NETWORKS, EXPRESSION, STABILITY, SYSTEM
  • Middle East Technical University Affiliated: Yes


An emerging research area in computational biology and biotechnology is devoted to the mathematical modeling and prediction of gene-expression patterns; a deep understanding of its foundations calls for rigorous mathematics. This article surveys data mining and machine learning methods for the analysis of complex systems in computational biology. It mathematically deepens recent advances in modeling and prediction by rigorously introducing the environment, together with aspects of errors and uncertainty, into the genetic context within the framework of matrix and interval arithmetic. Given the data from DNA microarray experiments and environmental measurements, we extract nonlinear ordinary differential equations containing parameters that are to be determined. This is done by generalized Chebyshev approximation and generalized semi-infinite optimization. Then, time-discretized dynamical systems are studied. A combinatorial algorithm that constructs and follows sequences of polyhedra detects the region of parametric stability. In addition, we analyze the topological landscape of gene-environment networks in terms of structural stability. As a second strategy, we review recent model selection and kernel learning methods for binary classification, which can be used to classify microarray data for cancerous cells or to discriminate other kinds of diseases. This review is practically motivated and theoretically elaborated; it aims to contribute to better health care, progress in medicine, better education, and healthier living conditions. Birth Defects Research (Part C) 87:165-181, 2009. © 2009 Wiley-Liss, Inc.