An integrative framework for clinical diagnosis and knowledge discovery from exome sequencing data


Shojaei M., Mohammadvand N., DOĞAN T., Alkan C., Çetin Atalay R., ACAR A. C.

Computers in Biology and Medicine, vol.169, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 169
  • Publication Date: 2024
  • DOI: 10.1016/j.compbiomed.2023.107810
  • Journal Name: Computers in Biology and Medicine
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, BIOSIS, Biotechnology Research Abstracts, CINAHL, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, Library, Information Science & Technology Abstracts (LISTA)
  • Keywords: Gene ontology, Insertion-deletion variants, Mutation, Pearson correlation, Protein sequence, Variant pathogenicity prediction
  • Middle East Technical University Affiliated: Yes

Abstract

Non-silent single nucleotide genetic variants, such as nonsense changes and insertion-deletion variants, that substantially affect protein function and length are prevalent and frequently misclassified. The low sensitivity and specificity of existing variant effect predictors for nonsense and indel variations restrict their use in clinical applications. We propose the Pathogenic Mutation Prediction (PMPred) method to predict the pathogenicity of single nucleotide variations that impair protein function by prematurely terminating a protein's elongation during its synthesis. The prediction starts by monitoring the functional effects (Gene Ontology annotation changes) of the change in sequence, using an existing ensemble machine learning model (UniGOPred). This, in turn, reveals the mutations that deviate significantly in function from the wild-type sequence. We have identified novel harmful mutations in patient data and present them as motivating case studies. We also show that our method has increased sensitivity and specificity compared with state-of-the-art methods, especially for single nucleotide variations that produce large functional changes in the final protein. As further validation, we performed a comparative docking study on one such variation that is misclassified by existing methods and, using the altered binding affinities, show how PMPred can correctly predict the pathogenicity when other tools miss it. PMPred is freely accessible as a web service at https://pmpred.kansil.org/, and the related code is available at https://github.com/kansil/PMPred.
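
The idea of comparing Gene Ontology annotation profiles between a wild-type sequence and its prematurely terminated counterpart can be illustrated with a minimal Python sketch. This is not the PMPred implementation. The function names, the use of one minus the Pearson correlation as a deviation measure, and the example scores are all assumptions made for illustration; in practice the per-term scores would come from a function predictor such as UniGOPred, whose interface is not reproduced here.

```python
from math import sqrt


def truncate_at_stop(protein: str, stop_position: int) -> str:
    """Return the protein a nonsense variant would yield.

    stop_position is the 1-based residue replaced by a stop codon,
    so only the residues before it are kept.
    """
    if not 1 <= stop_position <= len(protein):
        raise ValueError("stop_position is outside the protein")
    return protein[: stop_position - 1]


def pearson(x: list[float], y: list[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def functional_deviation(wt_scores: dict[str, float],
                         mut_scores: dict[str, float]) -> float:
    """Hypothetical deviation measure: 1 - Pearson correlation over shared GO terms.

    0 means identical score profiles; larger values mean the mutant's
    predicted functions diverge more from the wild type.
    """
    terms = sorted(set(wt_scores) & set(mut_scores))
    return 1.0 - pearson([wt_scores[t] for t in terms],
                         [mut_scores[t] for t in terms])


# Made-up scores for three GO terms, standing in for predictor output.
wild_type = {"GO:0005524": 0.91, "GO:0003677": 0.12, "GO:0016301": 0.85}
mutant = {"GO:0005524": 0.20, "GO:0003677": 0.15, "GO:0016301": 0.10}

truncated = truncate_at_stop("MKTAYIAKQR", 4)  # keeps the first three residues
deviation = functional_deviation(wild_type, mutant)
```

A variant would then be flagged when `deviation` exceeds some threshold; how such a threshold is chosen, and whether PMPred uses this particular measure, is not stated in the abstract.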