Classifying protein kinase conformations with machine learning

Cited by: 2
Authors
Reveguk, Ivan [1]
Simonson, Thomas [1]
Affiliations
[1] Ecole Polytech, Lab Biol Struct Cellule, CNRS, UMR7654, Palaiseau, France
Keywords
ATPase; data mining; structural biology; XGBoost; CRYSTAL-STRUCTURE; C-ABL; ACTIVATION; SELECTION; INHIBITION; BINDING; DOMAIN; MECHANISMS; TRANSITION; PLASTICITY
DOI
10.1002/pro.4918
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Protein kinases are key actors in signaling networks and important drug targets. They cycle between active and inactive conformations, distinguished by a few elements within the catalytic domain. One is the activation loop, whose conserved DFG motif can occupy DFG-in, DFG-out, and some rarer conformations. Annotation and classification of the structural kinome are important, as different conformations can be targeted by different inhibitors and activators. Valuable resources exist; however, large-scale applications will benefit from increased automation and interpretability of structural annotation. Interpretable machine learning models are described for this purpose, based on ensembles of decision trees. To train them, a set of catalytic domain sequences and structures was collected, somewhat larger and more diverse than existing resources. The structures were clustered based on the DFG conformation, manually annotated, and then used as training input. Two main models were constructed, which distinguished active/inactive and in/out/other DFG conformations. They initially considered 1692 structural variables, spanning the whole catalytic domain, then identified ("learned") a small subset that sufficed for accurate classification. The first model correctly labeled all but 3 of 3289 structures as active or inactive, while the second assigned the correct DFG label to all but 17 of 8826 structures. The most potent classifying variables were all related to well-known structural elements in or near the activation loop, and their ranking gives insights into the conformational preferences. The models were used to automatically annotate 3850 kinase structures predicted recently with the AlphaFold2 tool, showing that AlphaFold2 reproduced the active/inactive but not the DFG-in proportions seen in the Protein Data Bank. We expect the models will be useful for understanding and engineering kinases.
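The abstract reports classification quality as counts of mislabeled structures rather than as percentages. As a minimal sketch, the conversion to accuracy can be worked through as below. The function name `accuracy` and the dictionary `REPORTED` are hypothetical names introduced here for illustration; only the four counts come from the abstract, and nothing here reproduces the authors' models or evaluation protocol.

```python
def accuracy(total: int, misclassified: int) -> float:
    """Fraction of structures labeled correctly, given the error count."""
    if total <= 0 or not 0 <= misclassified <= total:
        raise ValueError("need total > 0 and 0 <= misclassified <= total")
    return (total - misclassified) / total


# Counts quoted in the abstract: (structures evaluated, structures mislabeled).
# The dictionary keys are illustrative labels, not names used by the authors.
REPORTED = {
    "active/inactive": (3289, 3),
    "DFG in/out/other": (8826, 17),
}

for task, (total, wrong) in REPORTED.items():
    acc = accuracy(total, wrong)
    print(f"{task}: {acc:.2%} correct ({wrong} of {total} mislabeled)")
```

Under these counts, both models come out above 99.8% correct, roughly 99.91% for the active/inactive task and 99.81% for the DFG task. The abstract does not say whether the counts refer to training data, held-out data, or both, so these figures should not be read as generalization estimates.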
Pages: 21