PREDICTION OF TRANSMEMBRANE REGIONS OF G PROTEIN-COUPLED RECEPTORS USING MACHINE LEARNING TECHNIQUES

MUAZZEZ ÇELEBİ ÇINAR

Thesis Type: Master's

Institution: Middle East Technical University, Graduate School of Natural and Applied Sciences, Turkey

Approval Date: 2019

Thesis Language: English

Student: MUAZZEZ ÇELEBİ ÇINAR

Principal Supervisor (for co-supervised theses): Çağdaş Devrim Son

Co-Supervisor: Tolga Can

Open Archive Collection: AVESİS Open Access Collection

Abstract:

G protein-coupled receptors (GPCRs) are one of the largest and most significant membrane receptor families in eukaryotes. They transmit extracellular stimuli to the inside of the cell by undergoing conformational changes. GPCRs can recognize a diverse range of extracellular ligands, including hormones, neurotransmitters, odorants, photons, and ions. These receptors are associated with a variety of human diseases, such as cancer and central nervous system disorders, and can be regarded as one of the most important targets for the pharmaceutical industry. They have seven transmembrane helices that contain essential regions such as ligand binding sites, actuator protein (e.g. G protein) binding sites, and cholesterol binding sites. There is a large gap in topology data for membrane proteins because of experimental limitations resulting from the instability of the membrane. In UniProt, a freely available database of protein sequences with structural and functional information, only 29 GPCRs among the thousands listed have experimentally solved transmembrane (TM) region data. The topology information of other membrane proteins is provided by the TMHMM prediction tool, which is based on hidden Markov models. However, TMHMM incorrectly predicts the total number of TM regions for 6 of the 29 experimentally determined GPCRs. In this study, we aim to develop a GPCR-specific TM prediction algorithm using machine learning techniques. The algorithm is based on the hydrophobicity of each amino acid in the protein sequence and on the secondary structure. Two hydrophobicity scales, Moon-Fleming and Kyte-Doolittle, are implemented separately. The secondary structures are derived from the JPred server. With this algorithm, we obtain more than 85% accuracy with a higher true positive rate. The results could inform further research and facilitate structure-based drug discovery, opening further therapeutic opportunities for many diseases.
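
To make the per-residue hydrophobicity feature concrete, the following is a minimal Python sketch of a classical sliding-window hydropathy profile on the Kyte-Doolittle scale. It is not the machine learning algorithm developed in the thesis, and it ignores the secondary-structure input from JPred. The function names, the window length of 19 residues, and the threshold of 1.6 are assumptions chosen for illustration (they follow commonly cited defaults for the Kyte-Doolittle method), not values taken from the thesis.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(sequence, window=19):
    """Return the mean hydropathy of every full window.

    Element i of the result covers sequence[i : i + window].
    Non-standard residue codes (e.g. "X") raise KeyError.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if len(values) < window:
        return []
    total = sum(values[:window])
    means = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


def candidate_tm_segments(sequence, window=19, threshold=1.6):
    """Merge overlapping windows whose mean hydropathy exceeds threshold.

    Returns a list of (start, end) pairs, 0-based and half-open.
    The window and threshold defaults are illustrative assumptions.
    """
    segments = []
    for i, mean in enumerate(windowed_hydropathy(sequence, window)):
        if mean > threshold:
            start, end = i, i + window
            if segments and start <= segments[-1][1]:
                # Overlaps or touches the previous segment: extend it.
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments
```

Calling `len(candidate_tm_segments(seq))` on a receptor sequence gives a rough count of hydrophobic stretches that can be compared with the seven helices expected for a GPCR. Because whole windows are merged, the reported segments are typically wider than the underlying helices, and short loops between helices can cause neighbouring segments to fuse. This is one reason a plain threshold is only a baseline next to a trained predictor.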