ParaAntiProt provides paratope prediction using antibody and protein language models

Cited by: 0
Authors
Kalemati, Mahmood [1]
Noroozi, Alireza [1]
Shahbakhsh, Aref [1]
Koohi, Somayyeh [1]
Affiliation
[1] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Source
SCIENTIFIC REPORTS | 2024 / Volume 14 / Issue 01
Keywords
Paratope prediction; Antibody Language models; Protein Language models; Complementarity determining regions; Deep learning;
DOI
10.1038/s41598-024-80940-y
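Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes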
07 ; 0710 ; 09 ;
Abstract
Efficiently predicting the paratope holds considerable potential for enhancing antibody design, treating cancers and other serious diseases, and advancing personalized medicine. Although traditional methods are highly accurate, they are often time-consuming, labor-intensive, and reliant on 3D structures, which restricts their broader use. Machine learning-based methods, in addition to relying on structural data, require descriptor computation, consideration of diverse physicochemical properties, and feature engineering. Here, we develop a deep learning-assisted method for paratope identification that relies solely on amino acid sequences and is antigen-agnostic. Built on the ProtTrans architecture and using pre-trained protein and antibody language models, our method extracts embeddings for paratope prediction. By incorporating positional encoding for Complementarity Determining Regions (CDRs), our model gains a deeper structural understanding, achieving a ROC AUC of 0.904, an F1-score of 0.701, and an MCC of 0.585 on benchmark datasets. In addition to yielding accurate antibody paratope predictions, our method performs strongly on nanobody paratopes, achieving a ROC AUC of 0.912 and a PR AUC of 0.665 on the nanobody dataset. Notably, our approach outperforms structure-based prediction methods, with a PR AUC of 0.731. Ablation studies, which examine the impact of each part of the model on the prediction task, show that the improvement obtained by applying CDR positional encoding together with CNNs depends on the specific protein and antibody language models used. These results highlight the potential of our method to advance disease understanding and aid in the discovery of new diagnostics and antibody therapies.
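The abstract reports per-residue classification metrics (F1-score and MCC). As a reading aid, the minimal sketch below shows how these two metrics are conventionally computed from binary paratope labels. The function name `binary_metrics` and the toy label vectors are illustrative assumptions, not the authors' code or data.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return F1 and MCC for binary labels (1 = paratope residue, 0 = other).

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # F1 is the harmonic mean of precision and recall.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC uses all four confusion-matrix cells, which makes it more
    # informative than accuracy when paratope residues are a minority.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {"f1": f1, "mcc": mcc}


# Toy example: 8 residues, 3 of which are true paratope residues.
toy_true = [1, 1, 0, 0, 1, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 1, 1, 0, 0]
toy_scores = binary_metrics(toy_true, toy_pred)
print(toy_scores)
```

ROC AUC and PR AUC, also cited in the abstract, are computed from predicted probabilities rather than hard labels and are not covered by this sketch.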
Pages: 15