BgN-Score and BsN-Score: Bagging and boosting based ensemble neural networks scoring functions for accurate binding affinity prediction of protein-ligand complexes

Cited by: 49
Authors
Ashtawy, Hossam M. [1]
Mahapatra, Nihar R. [1]
Affiliations
[1] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA
Source
BMC BIOINFORMATICS | 2015 / Vol. 16
Funding
National Science Foundation (USA);
Keywords
MOLECULAR DOCKING; RECOGNITION; VALIDATION; DISCOVERY;
DOI
10.1186/1471-2105-16-S4-S8
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: Accurately predicting the binding affinities of large sets of protein-ligand complexes is a key challenge in computational biomolecular science, with applications in drug discovery, chemical biology, and structural biology. Since a scoring function (SF) is used to score, rank, and identify drug leads, the fidelity with which it predicts the affinity of a ligand candidate for a protein's binding site has a significant bearing on the accuracy of virtual screening. Despite intense efforts in developing conventional SFs, which are either force-field based, knowledge-based, or empirical, their limited predictive power has been a major roadblock toward cost-effective drug discovery. Therefore, in this work, we present novel SFs employing a large ensemble of neural networks (NN) in conjunction with a diverse set of physicochemical and geometrical features characterizing protein-ligand complexes to predict binding affinity.

Results: We assess the scoring accuracies of two new ensemble NN SFs based on bagging (BgN-Score) and boosting (BsN-Score), as well as those of conventional SFs in the context of the 2007 PDBbind benchmark that encompasses a diverse set of high-quality protein families. We find that BgN-Score and BsN-Score have more than 25% better Pearson's correlation coefficient (0.804 and 0.816 vs. 0.644) between predicted and measured binding affinities compared to that achieved by a state-of-the-art conventional SF. In addition, these ensemble NN SFs are also at least 19% more accurate (0.804 and 0.816 vs. 0.675) than SFs based on a single neural network that has been traditionally used in drug discovery applications. We further find that ensemble models based on NNs surpass SFs based on the decision-tree ensemble technique Random Forests.

Conclusions: Ensemble neural networks SFs, BgN-Score and BsN-Score, are the most accurate in predicting binding affinity of protein-ligand complexes among the considered SFs. Moreover, their accuracies are even higher when they are used to predict binding affinities of protein-ligand complexes that are related to their training sets.
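The accuracy figures in the abstract are Pearson correlation coefficients, and the quoted percentages are relative gains over a baseline coefficient. The sketch below shows how both quantities can be computed. It is illustrative only and is not the authors' code: the function names `pearson_r` and `relative_gain_percent` are hypothetical, and the only data used are the four coefficients quoted in the abstract.

```python
import math


def pearson_r(predicted, measured):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_m)


def relative_gain_percent(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Correlation coefficients quoted in the abstract (no other data is assumed).
REPORTED = {"BgN-Score": 0.804, "BsN-Score": 0.816}
BASELINES = {"conventional SF": 0.644, "single NN": 0.675}

for sf_name, r in REPORTED.items():
    for base_name, base_r in BASELINES.items():
        gain = relative_gain_percent(r, base_r)
        print(f"{sf_name} vs. {base_name}: {gain:.1f}% higher correlation")
```

With the quoted coefficients, the gains come out to roughly 24.8% (BgN-Score) and 26.7% (BsN-Score) over the conventional SF, and roughly 19.1% and 20.9% over the single NN, so the "more than 25%" figure holds for BgN-Score only after rounding.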
Pages: 12