Classifying protein kinase conformations with machine learning

Cited by: 2

Authors:

Reveguk, Ivan [1]

Simonson, Thomas [1]

Affiliation:

[1] Ecole Polytech, Lab Biol Struct Cellule, CNRS, UMR7654, Palaiseau, France

Source:

PROTEIN SCIENCE | 2024 / Vol. 33 / Issue 04

Keywords:

ATPase; data mining; structural biology; XGBoost; CRYSTAL-STRUCTURE; C-ABL; ACTIVATION; SELECTION; INHIBITION; BINDING; DOMAIN; MECHANISMS; TRANSITION; PLASTICITY;

DOI:

10.1002/pro.4918

Chinese Library Classification:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Protein kinases are key actors of signaling networks and important drug targets. They cycle between active and inactive conformations, distinguished by a few elements within the catalytic domain. One is the activation loop, whose conserved DFG motif can occupy DFG-in, DFG-out, and some rarer conformations. Annotation and classification of the structural kinome are important, as different conformations can be targeted by different inhibitors and activators. Valuable resources exist; however, large-scale applications will benefit from increased automation and interpretability of structural annotation. Interpretable machine learning models are described for this purpose, based on ensembles of decision trees. To train them, a set of catalytic domain sequences and structures was collected, somewhat larger and more diverse than existing resources. The structures were clustered based on the DFG conformation and manually annotated. They were then used as training input. Two main models were constructed, which distinguished active/inactive and in/out/other DFG conformations. They initially considered 1692 structural variables, spanning the whole catalytic domain, then identified ("learned") a small subset that sufficed for accurate classification. The first model correctly labeled all but 3 of 3289 structures as active or inactive, while the second assigned the correct DFG label to all but 17 of 8826 structures. The most potent classifying variables were all related to well-known structural elements in or near the activation loop, and their ranking gives insights into the conformational preferences. The models were used to automatically annotate 3850 kinase structures predicted recently with the AlphaFold2 tool, showing that AlphaFold2 reproduced the active/inactive but not the DFG-in proportions seen in the Protein Data Bank. We expect the models will be useful for understanding and engineering kinases.
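
The abstract describes models that start from many structural variables and retain a small subset that suffices for classification. The sketch below illustrates that general idea only; it is not the authors' pipeline, which the keywords suggest was built on XGBoost. It ranks variables by how well a single-variable threshold rule (a decision stump) separates two classes. The data are synthetic, and the names `best_stump_accuracy` and `rank_variables`, the six variables, and the choice of variable 2 as the informative one are all assumptions made for illustration.

```python
import random


def best_stump_accuracy(values, labels):
    """Best accuracy of a one-variable rule 'value > t' or its inverse."""
    n = len(labels)
    best = 0.0
    for t in sorted(set(values)):
        hits = sum((v > t) == bool(y) for v, y in zip(values, labels))
        # Also consider the inverted rule 'value <= t'.
        best = max(best, hits / n, (n - hits) / n)
    return best


def rank_variables(rows, labels):
    """Return (variable index, stump accuracy) pairs, best first."""
    n_vars = len(rows[0])
    scores = [
        (j, best_stump_accuracy([row[j] for row in rows], labels))
        for j in range(n_vars)
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Synthetic stand-in data (hypothetical): 200 "structures", 6 variables.
# Only variable 2 determines the class label; the rest are noise.
rng = random.Random(0)
rows = [[rng.random() for _ in range(6)] for _ in range(200)]
labels = [1 if row[2] > 0.5 else 0 for row in rows]

ranking = rank_variables(rows, labels)
top_variable, top_accuracy = ranking[0]
print(ranking)
```

In this toy setting the informative variable comes out on top, and the noise variables could be dropped without losing accuracy. A boosted tree ensemble does something analogous over many trees and variable combinations, which is what makes its variable ranking interpretable.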


Pages: 21
