Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14

Cited by: 42
Authors
Zheng, Wei [1]
Li, Yang [1, 2]
Zhang, Chengxin [1]
Zhou, Xiaogen [1]
Pearce, Robin [1]
Bell, Eric W. [1]
Huang, Xiaoqiang [1]
Zhang, Yang [1, 3]
Affiliations
[1] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[3] Univ Michigan, Dept Biol Chem, Ann Arbor, MI 48109 USA
Funding
National Science Foundation (USA)
Keywords
ab initio folding; CASP14; deep learning; domain partition; multiple sequence alignment; protein structure prediction; residue-residue distance prediction; FOLD-RECOGNITION; I-TASSER; SIMILARITY; SERVER;
DOI
10.1002/prot.26193
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
In this article, we report 3D structure prediction results by two of our best server groups ("Zhang-Server" and "QUARK") in CASP14. These two servers were built based on the D-I-TASSER and D-QUARK algorithms, which integrated four newly developed components into the classical protein folding pipelines, I-TASSER and QUARK, respectively. The new components include: (a) a new multiple sequence alignment (MSA) collection tool, DeepMSA2, which is extended from the DeepMSA program; (b) a contact-based domain boundary prediction algorithm, FUpred, to detect protein domain boundaries; (c) a residual convolutional neural network-based method, DeepPotential, to predict multiple spatial restraints by co-evolutionary features derived from the MSA; and (d) optimized spatial restraint energy potentials to guide the structure assembly simulations. For 37 FM targets, the average TM-scores of the first models produced by D-I-TASSER and D-QUARK were 96% and 112% higher than those constructed by I-TASSER and QUARK, respectively. The data analysis indicates noticeable improvements produced by each of the four new components, especially for the newly added spatial restraints from DeepPotential and the well-tuned force field that combines spatial restraints, threading templates, and generic knowledge-based potentials. However, challenges still exist in the current pipelines. These include difficulties in modeling multi-domain proteins due to low accuracy in inter-domain distance prediction and modeling protein domains from oligomer complexes, as the co-evolutionary analysis cannot distinguish inter-chain and intra-chain distances. Specifically tuning the deep learning-based predictors for multi-domain targets and protein complexes may be helpful to address these issues.
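The abstract reports its gains as relative increases in average TM-score. As a minimal sketch of what that metric measures, the snippet below evaluates the standard TM-score sum for one fixed superposition, given per-residue distances between aligned model and native residues. It is not the authors' evaluation code: the function names are hypothetical, the search over superpositions that a full TM-score program performs is omitted, and the 0.5 Å floor on d0 for very short targets is an assumed convention.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in Angstroms) used by TM-score."""
    # Assumption: short targets fall back to a 0.5 A floor, because the
    # cube-root formula is not meaningful for very small lengths.
    if target_length <= 21:
        return 0.5
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score_from_distances(distances, target_length: int) -> float:
    """TM-score for ONE fixed superposition.

    `distances` holds the distance (Angstroms) between each aligned
    model/native residue pair. A full TM-score program maximizes this
    quantity over superpositions; that search is left out here.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    # Normalizing by the target length means unaligned residues add nothing.
    return total / target_length


def relative_gain_percent(new_score: float, old_score: float) -> float:
    """Percentage by which new_score exceeds old_score."""
    return 100.0 * (new_score - old_score) / old_score
```

Under this definition a perfect model scores 1.0, and a model whose aligned residues all sit exactly d0 from their native positions scores 0.5. A relative gain of 96% means the newer average is 1.96 times the older one.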
Pages: 1734-1751
Page count: 18