Selection of representative SNP sets for genome-wide association studies: a metaheuristic approach

Ustunkar G., Akyüz S., Weber G. W., Friedrich C. M., Aydin Son Y.

OPTIMIZATION LETTERS, vol.6, pp.1207-1218, 2012 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 6
  • Publication Date: 2012
  • Doi Number: 10.1007/s11590-011-0419-7
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.1207-1218
  • Keywords: Simulated annealing, Feature selection, Representative SNP selection, SNP-complex disease association, Bioinformatics, OR in computational biology, ALGORITHM
  • Middle East Technical University Affiliated: Yes


Since the completion of the Human Genome Project in 2003, it has become possible to associate genetic variations in the human genome with common and complex diseases. The current challenge is to use genomic data efficiently and to develop tools that improve our understanding of the etiology of complex diseases. Many of the algorithms needed for this task were originally developed in management science and operations research (OR). One application is to select, from the whole set of Single Nucleotide Polymorphism (SNP) biomarkers, a subset that is informative and small enough for subsequent association studies. In this paper, we present an OR application for representative SNP selection that implements our novel Simulated Annealing (SA) based feature-selection algorithm. We hope that our work will facilitate reliable identification of SNPs involved in the etiology of complex diseases and ultimately support timely identification of genomic disease biomarkers, the development of personalized-medicine approaches, and targeted drug discovery.
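
The abstract does not spell out the algorithm, so the following is only a minimal, generic sketch of simulated annealing applied to subset selection, not the authors' method. The function name `simulated_annealing_select`, the cooling schedule, the toy `block_of` assignment of SNPs to correlated blocks, and the `coverage_score` objective are all hypothetical, chosen only to illustrate how a small, informative subset can be searched for.

```python
import math
import random


def simulated_annealing_select(n_features, score, n_iter=2000,
                               t_start=1.0, t_end=0.01, seed=0):
    """Generic SA over boolean feature masks; maximizes score(mask)."""
    rng = random.Random(seed)
    # Start from a random subset of features.
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_score = score(current)
    best, best_score = current[:], current_score
    for step in range(n_iter):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(1, n_iter - 1))
        # Neighbor move: flip one randomly chosen feature in or out.
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] = not candidate[i]
        delta = score(candidate) - current_score
        # Always accept improvements; accept worse moves with
        # probability exp(delta / t), which shrinks as t cools.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = candidate, current_score + delta
            if current_score > best_score:
                best, best_score = current[:], current_score
    return best, best_score


# Hypothetical toy data: 9 SNPs grouped into 3 blocks of mutually
# redundant markers (a stand-in for SNPs that carry the same signal).
block_of = [0, 0, 0, 1, 1, 2, 2, 2, 2]


def coverage_score(mask):
    """Assumed objective: reward covered blocks, penalize subset size."""
    covered = {block_of[i] for i, keep in enumerate(mask) if keep}
    return len(covered) - 0.1 * sum(mask)


best_mask, best_value = simulated_annealing_select(len(block_of), coverage_score)
selected = [i for i, keep in enumerate(best_mask) if keep]
print(selected, round(best_value, 2))
```

In this toy setting the highest-scoring subsets keep exactly one SNP per block. A real study would replace `coverage_score` with a measure computed from genotype and phenotype data.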