Classifying protein kinase conformations with machine learning

Cited by: 2
Authors
Reveguk, Ivan [1]
Simonson, Thomas [1]
Affiliations
[1] Ecole Polytech, Lab Biol Struct Cellule, CNRS, UMR7654, Palaiseau, France
Keywords
ATPase; data mining; structural biology; XGBoost; CRYSTAL-STRUCTURE; C-ABL; ACTIVATION; SELECTION; INHIBITION; BINDING; DOMAIN; MECHANISMS; TRANSITION; PLASTICITY
DOI
10.1002/pro.4918
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Protein kinases are key actors of signaling networks and important drug targets. They cycle between active and inactive conformations, distinguished by a few elements within the catalytic domain. One is the activation loop, whose conserved DFG motif can occupy DFG-in, DFG-out, and some rarer conformations. Annotation and classification of the structural kinome are important, as different conformations can be targeted by different inhibitors and activators. Valuable resources exist; however, large-scale applications will benefit from increased automation and interpretability of structural annotation. Interpretable machine learning models are described for this purpose, based on ensembles of decision trees. To train them, a set of catalytic domain sequences and structures was collected, somewhat larger and more diverse than existing resources. The structures were clustered based on the DFG conformation and manually annotated. They were then used as training input. Two main models were constructed, which distinguished active/inactive and in/out/other DFG conformations. They considered initially 1692 structural variables, spanning the whole catalytic domain, then identified ("learned") a small subset that sufficed for accurate classification. The first model correctly labeled all but 3 of 3289 structures as active or inactive, while the second assigned the correct DFG label to all but 17 of 8826 structures. The most potent classifying variables were all related to well-known structural elements in or near the activation loop and their ranking gives insights into the conformational preferences. The models were used to automatically annotate 3850 kinase structures predicted recently with the Alphafold2 tool, showing that Alphafold2 reproduced the active/inactive but not the DFG-in proportions seen in the Protein Data Bank. We expect the models will be useful for understanding and engineering kinases.
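The idea of starting from many structural variables and keeping only the few that suffice for classification can be illustrated with a toy sketch. This is not the authors' XGBoost pipeline: it ranks variables by the best single-threshold split (a decision stump, the building block of a decision tree) on synthetic data. The variable names (`dfg_phe_distance`, `noise_*`), the threshold of 5.0, and the data themselves are hypothetical and chosen only for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump_gain(values, labels):
    """Largest Gini impurity reduction from one threshold on one variable."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini(labels)
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal values cannot be separated by a threshold
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_variables(X, y, names):
    """Return (gain, name) pairs sorted from most to least discriminating."""
    gains = [
        (best_stump_gain([row[j] for row in X], y), names[j])
        for j in range(len(names))
    ]
    return sorted(gains, reverse=True)


def make_toy_data(n=200, seed=0):
    """Synthetic data: only the first (hypothetical) variable sets the label."""
    rng = random.Random(seed)
    names = ["dfg_phe_distance", "noise_1", "noise_2", "noise_3"]
    X, y = [], []
    for _ in range(n):
        row = [rng.uniform(0.0, 10.0) for _ in names]
        X.append(row)
        y.append(1 if row[0] > 5.0 else 0)  # assumed threshold, illustration only
    return X, y, names


X, y, names = make_toy_data()
ranking = rank_variables(X, y, names)
selected = [name for _, name in ranking[:1]]  # keep the single best variable
for gain, name in ranking:
    print(f"{name}: impurity reduction {gain:.3f}")
```

In this toy setting the informative variable ranks first because a single threshold separates the two classes perfectly, whereas the noise variables yield only small impurity reductions. Tree ensembles such as those named in the abstract combine many such splits, and their variable rankings are correspondingly richer.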
Pages: 21
Related papers
50 in total
  • [31] Classifying Legal Norms with Active Machine Learning
    Waltl, Bernhard
    Muhr, Johannes
    Glaser, Ingo
    Bonczek, Georg
    Scepankova, Elena
    Matthes, Florian
    LEGAL KNOWLEDGE AND INFORMATION SYSTEMS, 2017, 302 : 11 - 20
  • [32] Predicting new protein conformations from molecular dynamics simulation conformational landscapes and machine learning
    Jin, Yiming
    Johannissen, Linus O.
    Hay, Sam
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2021, 89 (08) : 915 - 921
  • [33] Classifying protein kinase structures guides use of ligand-selectivity profiles to predict inactive conformations: Structure of lck/imatinib complex
    Jacobs, Marc D.
    Caron, Paul R.
    Hare, Brian J.
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2008, 70 (04) : 1451 - 1460
  • [34] Impact of protein and small molecule interactions on kinase conformations
    Kugler, Valentina
    Schwaighofer, Selina
    Feichtner, Andreas
    Enzler, Florian
    Fleischmann, Jakob
    Strich, Sophie
    Schwarz, Sarah
    Wilson, Rebecca
    Tschaikner, Philipp
    Troppmair, Jakob
    Sexl, Veronika
    Meier, Pascal
    Kaserer, Teresa
    Stefan, Eduard
    ELIFE, 2024, 13
  • [35] Identifying knot types of polymer conformations by machine learning
    Vandans, Olafs
    Yang, Kaiyuan
    Wu, Zhongtao
    Dai, Liang
    PHYSICAL REVIEW E, 2020, 101 (02)
  • [36] A hybrid machine learning model for classifying time series
    Abdullah Elen
    Emre Avuçlu
    Neural Computing and Applications, 2022, 34 : 1219 - 1237
  • [37] Classifying Restatements: An Application of Machine Learning and Textual Analytics
    Hayes, Louise
    Boritz, J. Efrim
    JOURNAL OF INFORMATION SYSTEMS, 2021, 35 (03) : 107 - 131
  • [38] A Machine Learning Approach for Classifying Road Accident Hotspots
    Amorim, Brunna de Sousa Pereira
    Firmino, Anderson Almeida
    Baptista, Claudio de Souza
    Braz, Geraldo
    de Paiva, Anselmo Cardoso
    de Almeida, Francisco Edeverton
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2023, 12 (06)
  • [39] Classifying snapshots of the doped Hubbard model with machine learning
    Bohrdt, Annabelle
    Chiu, Christie S.
    Ji, Geoffrey
    Xu, Muqing
    Greif, Daniel
    Greiner, Markus
    Demler, Eugene
    Grusdt, Fabian
    Knap, Michael
    NATURE PHYSICS, 2019, 15 (09) : 921 - 924
  • [40] Classifying features of freeway crashes using machine learning
    Najafi, Zahra
    Sadeghi, Rasool
    Arghami, Shirazeh
    INTERNATIONAL JOURNAL OF CRASHWORTHINESS, 2022, 27 (06) : 1678 - 1686