MetalTrans: A Biological Language Model-Based Approach for Predicting Disease-Associated Mutations in Protein Metal-Binding Sites

Cited by: 1
Authors
Zhang, Ming [1]
Wang, Xiaohua [1]
Xu, Shanruo [2]
Ge, Fang [3, 4]
Paixao, Ian Costa [5, 6, 7]
Song, Jiangning [5, 6, 7]
Yu, Dong-Jun [8]
Affiliations
[1] Jiangsu Univ Sci & Technol, Sch Comp, Zhenjiang 212100, Peoples R China
[2] Duke Kunshan Univ, Kunshan 215316, Jiangsu, Peoples R China
[3] Nanjing Univ Posts & Telecommun, State Key Lab Organ Elect & Informat Displays, Nanjing 210023, Peoples R China
[4] Nanjing Univ Posts & Telecommun, Inst Adv Mat IAM, Nanjing 210023, Peoples R China
[5] Monash Univ, Monash Biomed Discovery Inst, Melbourne, Vic 3800, Australia
[6] Monash Univ, Dept Biochem & Mol Biol, Melbourne, Vic 3800, Australia
[7] Monash Univ, Monash Data Futures Inst, Melbourne, Vic 3800, Australia
[8] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CAUSE NOONAN; METALLOPROTEINS; SELECTIVITY; RESOURCE; INSIGHTS; DATABASE; UNIPROT; UREE;
DOI
10.1021/acs.jcim.4c00739
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Accurately predicting disease-associated mutations in protein metal-binding sites is important for drug discovery and disease diagnosis. MetalTrans is a predictor for such mutations. Its core design integrates multifeature splicing with the Transformer framework to extract features thoroughly. Its deep feature combination strategy merges evolutionary-scale modeling (ESM) amino acid embeddings with ProtTrans embeddings to capture the biochemical properties of proteins. The Transformer component uses the self-attention mechanism to learn higher-level representations. Using mutation site information for feature fusion enriches the feature set and avoids the overestimation commonly associated with protein sequence-based predictions. According to comparative analyses, this approach to feature fusion enables MetalTrans to significantly outperform existing methods. In our evaluations on metal-binding site data sets (Zn, Ca, Mg, and Mix), MetalTrans achieved average AUC values of 0.971, 0.965, 0.980, and 0.945, respectively, on multiple 5-fold cross-validation. Compared with the multichannel convolutional neural network method on a benchmark independent test set, MetalTrans achieved an AUC score of 0.998 on multiple 5-fold cross-validation. Our further examination of the predicted outcomes confirms the effectiveness of the model.
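The feature fusion and self-attention steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The toy embedding sizes, the random values standing in for ESM and ProtTrans outputs, the window around the mutation site, and the helper names `concat_site_features`, `site_window`, and `self_attention` are all assumptions made for illustration. The attention uses identity projections instead of learned weights.

```python
import math
import random


def concat_site_features(esm_vec, prottrans_vec):
    """Splice two embeddings of the same residue into one feature vector."""
    return list(esm_vec) + list(prottrans_vec)


def site_window(per_residue, site, flank):
    """Keep only residues within `flank` positions of the mutation site (assumed scheme)."""
    lo = max(0, site - flank)
    hi = min(len(per_residue), site + flank + 1)
    return per_residue[lo:hi]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V (no learned weights)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Each output row is a weighted average of all token vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


random.seed(0)
seq_len, esm_dim, pt_dim = 12, 4, 3  # toy sizes; real embeddings are far larger

# Random stand-ins for per-residue embeddings from the two language models.
esm = [[random.gauss(0, 1) for _ in range(esm_dim)] for _ in range(seq_len)]
prottrans = [[random.gauss(0, 1) for _ in range(pt_dim)] for _ in range(seq_len)]

fused = [concat_site_features(e, p) for e, p in zip(esm, prottrans)]
window = site_window(fused, site=5, flank=2)  # hypothetical mutation at index 5
attended = self_attention(window)
```

In a trained model the queries, keys, and values would come from learned linear projections, and a classifier head would turn the attended representation into a disease-association score.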
Pages: 6216-6229
Number of pages: 14