GPCRsort-Responding to the Next Generation Sequencing Data Challenge: Prediction of G Protein-Coupled Receptor Classes Using Only Structural Region Lengths

Sahın M. E. , Can T. , Son Ç. D.

OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY, vol.18, pp.636-644, 2014 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 18
  • Publication Date: 2014
  • Doi Number: 10.1089/omi.2014.0073
  • Page Numbers: pp.636-644


Next generation sequencing (NGS) and the attendant data deluge are increasingly impacting molecular life sciences research. Chief among the challenges and opportunities is enhancing our ability to classify molecular target data into a meaningful and cohesive systematic nomenclature. In this vein, the G protein-coupled receptors (GPCRs) are the largest and most divergent receptor family, and they play a crucial role in a host of pathophysiological pathways. For the pharmaceutical industry, GPCRs are a major drug target, and it is estimated that 60%-70% of all medicines in development today target GPCRs. Hence, they require efficient and rapid classification that groups the members according to their functions. In addition to NGS and the Big Data challenge we currently face, a growing number of orphan GPCRs further demands novel, rapid, and accurate classification of the receptors, since the current classification tools are inadequate and slow. This study presents the development of GPCRsort, a new classification tool for GPCRs that uses structural features derived from their primary sequences. Comparison experiments with the currently known GPCR classification techniques showed that GPCRsort is able to rapidly (on the order of minutes) classify uncharacterized GPCRs with 97.3% accuracy, whereas the best available technique's accuracy is 90.7%. GPCRsort is available in the public domain for postgenomics life scientists engaged in GPCR research with NGS: