Interpretable Machine Learning of Amino Acid Patterns in Proteins: A Statistical Ensemble Approach

Cited by: 2
Authors
Braghetto, Anna [1, 2]
Orlandini, Enzo [1, 2]
Baiesi, Marco [1, 2]
Affiliations
[1] Univ Padua, Dept Phys & Astron, Via Marzolo 8, I-35131 Padua, Italy
[2] INFN, Sez Padova, Via Marzolo 8, I-35131 Padua, Italy
Keywords
SECONDARY STRUCTURE; POLAR; PREDICTION; DESIGN
DOI
10.1021/acs.jctc.3c00383
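Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Discipline classification codes
070304; 081704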
Abstract
Explainable and interpretable unsupervised machine learning helps one to understand the underlying structure of data. We introduce an ensemble analysis of machine learning models to consolidate their interpretation. Its application shows that restricted Boltzmann machines compress consistently into a few bits the information stored in a sequence of five amino acids at the start or end of α-helices or β-sheets. The weights learned by the machines reveal unexpected properties of the amino acids and the secondary structure of proteins: (i) His and Thr have a negligible contribution to the amphiphilic pattern of α-helices; (ii) there is a class of α-helices particularly rich in Ala at their end; (iii) Pro occupies most often slots otherwise occupied by polar or charged amino acids, and its presence at the start of helices is relevant; (iv) Glu and especially Asp on one side and Val, Leu, Iso, and Phe on the other display the strongest tendency to mark amphiphilic patterns, i.e., extreme values of an effective hydrophobicity, though they are not the most powerful (non)hydrophobic amino acids.
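As a rough illustration of the setup the abstract describes, the sketch below one-hot encodes a window of five amino acids and maps it to a few binary hidden units using the standard conditional activation of a restricted Boltzmann machine. It is not the authors' implementation: the number of hidden units (`N_HIDDEN`), the 0/1 encoding of hidden states, the 0.5 threshold, and the random untrained weights are all assumptions made only so the example runs.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
WINDOW = 5                            # window length mentioned in the abstract
N_HIDDEN = 3                          # hypothetical number of hidden units ("a few bits")


def one_hot(window: str) -> np.ndarray:
    """Encode a 5-residue window as a flat binary vector of length 5 * 20."""
    if len(window) != WINDOW:
        raise ValueError(f"expected {WINDOW} residues, got {len(window)}")
    v = np.zeros((WINDOW, len(AMINO_ACIDS)))
    for pos, aa in enumerate(window):
        v[pos, AMINO_ACIDS.index(aa)] = 1.0  # raises ValueError on unknown residue
    return v.ravel()


def hidden_bits(v: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Hidden-unit activation probabilities p(h=1|v), thresholded at 0.5 (assumption)."""
    p = 1.0 / (1.0 + np.exp(-(v @ W + b)))
    return (p > 0.5).astype(int)


# Random, untrained parameters: placeholders for weights a trained machine would learn.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(WINDOW * len(AMINO_ACIDS), N_HIDDEN))
b = np.zeros(N_HIDDEN)

code = hidden_bits(one_hot("AEELL"), W, b)  # a 3-bit code for one example window
```

In an ensemble analysis of the kind the abstract mentions, one would presumably train many such machines and compare their learned weight matrices; that step is not shown here.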
Pages: 6011-6022
Page count: 12