Classifying protein kinase conformations with machine learning

Cited by: 2
Authors
Reveguk, Ivan [1]
Simonson, Thomas [1]
Affiliations
[1] Ecole Polytech, Lab Biol Struct Cellule, CNRS, UMR7654, Palaiseau, France
Keywords
ATPase; data mining; structural biology; XGBoost; CRYSTAL-STRUCTURE; C-ABL; ACTIVATION; SELECTION; INHIBITION; BINDING; DOMAIN; MECHANISMS; TRANSITION; PLASTICITY;
DOI
10.1002/pro.4918
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Protein kinases are key actors in signaling networks and important drug targets. They cycle between active and inactive conformations, distinguished by a few elements within the catalytic domain. One is the activation loop, whose conserved DFG motif can occupy DFG-in, DFG-out, and some rarer conformations. Annotation and classification of the structural kinome are important, as different conformations can be targeted by different inhibitors and activators. Valuable resources exist; however, large-scale applications will benefit from increased automation and interpretability of structural annotation. Interpretable machine learning models are described for this purpose, based on ensembles of decision trees. To train them, a set of catalytic domain sequences and structures was collected, somewhat larger and more diverse than existing resources. The structures were clustered based on the DFG conformation, manually annotated, and then used as training input. Two main models were constructed, which distinguished active/inactive and in/out/other DFG conformations. They initially considered 1692 structural variables, spanning the whole catalytic domain, then identified ("learned") a small subset that sufficed for accurate classification. The first model correctly labeled all but 3 of 3289 structures as active or inactive, while the second assigned the correct DFG label to all but 17 of 8826 structures. The most potent classifying variables were all related to well-known structural elements in or near the activation loop, and their ranking gives insights into the conformational preferences. The models were used to automatically annotate 3850 kinase structures predicted recently with the AlphaFold2 tool, showing that AlphaFold2 reproduced the active/inactive but not the DFG-in proportions seen in the Protein Data Bank. We expect the models will be useful for understanding and engineering kinases.
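The idea of training an ensemble of decision trees on many variables and then ranking the few that matter can be illustrated with a small sketch. This is not the authors' pipeline: the keywords mention XGBoost, whereas the sketch below swaps in a simple bagged ensemble of shallow trees written with the Python standard library only. The data are synthetic, and every name (`INFORMATIVE`, `fit_ensemble`, the six hypothetical features, the rule that two of them decide the label) is an assumption made for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_split(X, y):
    """Find the (gain, feature, threshold) with the largest impurity decrease."""
    n, parent, best = len(y), gini(y), (0.0, None, None)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] < t]
            right = [lab for row, lab in zip(X, y) if row[j] >= t]
            if left and right:
                gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
                if gain > best[0]:
                    best = (gain, j, t)
    return best

def grow(X, y, depth, importance):
    """Grow a shallow tree; add each split's sample-weighted gain to `importance`."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == 0 or len(set(y)) == 1:
        return majority  # leaf: a class label
    gain, j, t = best_split(X, y)
    if j is None:
        return majority
    importance[j] += gain * len(y)
    lo = [(r, lab) for r, lab in zip(X, y) if r[j] < t]
    hi = [(r, lab) for r, lab in zip(X, y) if r[j] >= t]
    return (j, t,
            grow([r for r, _ in lo], [lab for _, lab in lo], depth - 1, importance),
            grow([r for r, _ in hi], [lab for _, lab in hi], depth - 1, importance))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] < t else right
    return node

def fit_ensemble(X, y, n_trees=11, depth=2, seed=0):
    """Bagging: each tree is trained on a bootstrap resample of the data."""
    rng = random.Random(seed)
    importance = [0.0] * len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        trees.append(grow([X[i] for i in idx], [y[i] for i in idx], depth, importance))
    return trees, importance

def predict(trees, row):
    """Majority vote over the trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

# Synthetic stand-in data (assumption): 6 variables, only two decide the label.
rng = random.Random(1)
N_FEATURES, INFORMATIVE = 6, (1, 4)
X = [[rng.uniform(-1, 1) for _ in range(N_FEATURES)] for _ in range(120)]
y = ["active" if all(row[j] > 0 for j in INFORMATIVE) else "inactive" for row in X]

trees, importance = fit_ensemble(X, y)
ranked = sorted(range(N_FEATURES), key=lambda j: importance[j], reverse=True)
accuracy = sum(predict(trees, r) == lab for r, lab in zip(X, y)) / len(y)
print("features ranked by importance:", ranked)
print("training accuracy:", round(accuracy, 3))
```

Because the importance scores are sums of impurity decreases, the ranking can be read directly as "which variables the trees actually used", which is the kind of interpretability the abstract refers to. A gradient-boosted model such as XGBoost builds its trees sequentially rather than by bagging, so the numbers would differ, but the ranking step is analogous.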
Pages: 21