MDeePred: Novel multi-channel protein featurization for deep learning-based binding affinity prediction in drug discovery


Rifaioglu A., Atalay R. C., KAHRAMAN D. C., DOĞAN T., Martin M., ATALAY M. V.

Bioinformatics, vol.37, no.5, pp.693-704, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 37 Issue: 5
  • Publication Date: 2021
  • Doi Number: 10.1093/bioinformatics/btaa858
  • Journal Name: Bioinformatics
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.693-704
  • Middle East Technical University Affiliated: Yes

Abstract

© 2021 Oxford University Press. All rights reserved.

Motivation: Identification of interactions between bioactive small molecules and target proteins is crucial for novel drug discovery, drug repurposing and uncovering off-target effects. Due to the tremendous size of the chemical space, experimental bioactivity screening efforts require the aid of computational approaches. Although deep learning models have been successful in predicting bioactive compounds, effective and comprehensive featurization of proteins, to be given as input to deep neural networks, remains a challenge.

Results: Here, we present a novel protein featurization approach to be used in deep learning-based compound-target protein binding affinity prediction. In the proposed method, multiple types of protein features, such as sequence, structural, evolutionary and physicochemical properties, are incorporated within multiple 2D vectors, which are then fed to state-of-the-art pairwise input hybrid deep neural networks to predict the real-valued compound-target protein interactions. The method adopts the proteochemometric approach, where both the compound and target protein features are used at the input level to model their interaction. The whole system, called MDeePred, is a new method for computational drug discovery and repositioning. We evaluated MDeePred on well-known benchmark datasets and compared its performance with the state-of-the-art methods. We also performed an in vitro comparative analysis of MDeePred predictions with selected kinase inhibitors' action on cancer cells. MDeePred is a scalable method with sufficiently high predictive performance. The featurization approach proposed here can also be utilized for other protein-related predictive tasks.

Availability and implementation: The source code, datasets, additional information and user instructions of MDeePred are available at https://github.com/cansyl/MDeePred.
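
The idea of stacking several protein property types into multiple 2D inputs can be illustrated with a minimal sketch. This is not the MDeePred implementation. It assumes that each channel is a length-by-length matrix whose entry (i, j) is looked up from an amino-acid pair matrix for residues i and j. The two pair matrices used here (residue identity and a toy charge product), the function name `featurize_protein` and the `max_len` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

# The 20 standard amino acids; any other symbol raises a KeyError in this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

# Hypothetical channel 1: residue identity (1 where both residues are the same type).
IDENTITY = np.eye(len(AMINO_ACIDS), dtype=np.float32)

# Hypothetical channel 2: product of toy side-chain charges
# (D, E = -1; K, R = +1; every other residue = 0).
charge = np.zeros(len(AMINO_ACIDS), dtype=np.float32)
for aa in "DE":
    charge[AA_INDEX[aa]] = -1.0
for aa in "KR":
    charge[AA_INDEX[aa]] = 1.0
CHARGE_PRODUCT = np.outer(charge, charge)

# Illustrative stand-ins for the sequence, structural, evolutionary and
# physicochemical matrices mentioned in the abstract.
PAIR_MATRICES = [IDENTITY, CHARGE_PRODUCT]


def featurize_protein(sequence, max_len=500):
    """Return an array of shape (channels, max_len, max_len).

    Sequences longer than max_len are truncated; shorter ones are zero-padded.
    """
    seq = sequence[:max_len]
    n = len(seq)
    idx = np.array([AA_INDEX[aa] for aa in seq], dtype=np.intp)
    channels = np.zeros((len(PAIR_MATRICES), max_len, max_len), dtype=np.float32)
    for c, pair_matrix in enumerate(PAIR_MATRICES):
        # Look up the pair value for every (residue i, residue j) combination.
        channels[c, :n, :n] = pair_matrix[np.ix_(idx, idx)]
    return channels


if __name__ == "__main__":
    features = featurize_protein("ACDK", max_len=6)
    print(features.shape)
```

In a proteochemometric setup such as the one the abstract describes, an array like this would form the protein side of the input, with a compound representation supplied separately to the pairwise network.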