De novo protein design by deep network hallucination

Cited by: 294
Authors
Anishchenko, Ivan [1, 2]
Pellock, Samuel J. [1, 2]
Chidyausiku, Tamuka M. [1, 2]
Ramelot, Theresa A. [3, 4]
Ovchinnikov, Sergey [5]
Hao, Jingzhou [3, 4]
Bafna, Khushboo [3, 4]
Norn, Christoffer [1, 2]
Kang, Alex [1, 2]
Bera, Asim K. [1, 2]
DiMaio, Frank [1, 2]
Carter, Lauren [1, 2]
Chow, Cameron M. [1, 2]
Montelione, Gaetano T. [3, 4]
Baker, David [1, 2, 6]
Affiliations
[1] Univ Washington, Dept Biochem, Seattle, WA 98195 USA
[2] Univ Washington, Inst Prot Design, Seattle, WA 98195 USA
[3] Rensselaer Polytech Inst, Dept Chem & Chem Biol, Troy, NY USA
[4] Rensselaer Polytech Inst, Ctr Biotechnol & Interdisciplinary Sci, Troy, NY USA
[5] Harvard Univ, John Harvard Distinguished Sci Fellowship Program, Cambridge, MA 02138 USA
[6] Univ Washington, Howard Hughes Med Inst, Seattle, WA 98195 USA
Keywords
NMR STRUCTURE; SOFTWARE SUITE; PREDICTION; ALIGNMENT; ALGORITHM; FEATURES; SYSTEM;
DOI
10.1038/s41586-021-04184-w
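Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code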
07 ; 0710 ; 09 ;
Abstract
There has been considerable recent progress in protein structure prediction using deep neural networks to predict inter-residue distances from amino acid sequences (1-3). Here we investigate whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generate random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting residue-residue distance maps, which, as expected, are quite featureless. We then carry out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (Kullback-Leibler divergence) between the inter-residue distance distributions predicted by the network and background distributions averaged over all proteins. Optimization from different random starting points resulted in novel proteins spanning a wide range of sequences and predicted structures. We obtained synthetic genes encoding 129 of the network-'hallucinated' sequences, and expressed and purified the proteins in Escherichia coli; 27 of the proteins yielded monodisperse species with circular dichroism spectra consistent with the hallucinated structures. We determined the three-dimensional structures of three of the hallucinated proteins, two by X-ray crystallography and one by NMR, and these closely matched the hallucinated models. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute alongside traditional physics-based models to the de novo design of proteins with new functions.
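The sampling loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_predictor` is a hypothetical stand-in for the trRosetta network, the uniform background replaces the background distributions averaged over all proteins, and the sequence length, step count, and temperature are illustrative assumptions. Only the overall scheme (random start, single-residue mutations, Metropolis acceptance on the Kullback-Leibler divergence) follows the text.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kl_divergence(pred, background, eps=1e-9):
    """Mean KL(pred || background) over residue pairs; last axis = distance bins."""
    pred = np.clip(pred, eps, 1.0)
    background = np.clip(background, eps, 1.0)
    return float(np.mean(np.sum(pred * np.log(pred / background), axis=-1)))


def toy_predictor(seq, n_bins=8):
    """Hypothetical stand-in for a structure prediction network (NOT trRosetta).

    Returns an (L, L, n_bins) array of distance-bin probabilities whose
    sharpness depends on the residue identities, so the objective varies.
    """
    idx = np.array([AMINO_ACIDS.index(a) for a in seq])
    bins = np.arange(n_bins)
    peak = (idx[:, None] * 7 + idx[None, :] * 13) % n_bins   # preferred bin per pair
    sharpness = (idx[:, None] + idx[None, :]) / 38.0 * 3.0   # arbitrary toy rule
    logits = -sharpness[:, :, None] * np.abs(bins[None, None, :] - peak[:, :, None])
    e = np.exp(logits)
    return e / e.sum(axis=-1, keepdims=True)


def hallucinate(predict, background, length=20, steps=200, temperature=0.1, seed=0):
    """Monte Carlo search in sequence space that maximizes the KL divergence.

    Returns (best_sequence, starting_score, best_score).
    """
    rng = np.random.default_rng(seed)
    alphabet = list(AMINO_ACIDS)
    seq = list(rng.choice(alphabet, size=length))            # random starting sequence
    score = kl_divergence(predict(seq), background)
    start_score = score
    best_seq, best_score = list(seq), score
    for _ in range(steps):
        trial = list(seq)
        trial[rng.integers(length)] = rng.choice(alphabet)   # single-residue mutation
        trial_score = kl_divergence(predict(trial), background)
        # Metropolis criterion: always accept improvements, sometimes accept worse moves.
        delta = trial_score - score
        if delta >= 0 or rng.random() < np.exp(delta / temperature):
            seq, score = trial, trial_score
            if score > best_score:
                best_seq, best_score = list(seq), score
    return "".join(best_seq), start_score, best_score


if __name__ == "__main__":
    uniform_background = np.full(8, 1.0 / 8)                 # assumed featureless background
    seq, start, best = hallucinate(toy_predictor, uniform_background)
    print(seq, round(start, 3), round(best, 3))
```

In a real setting the `predict` argument would wrap an actual structure prediction network and the background would be estimated from data; the toy predictor here only makes the loop executable.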
Pages: 547 / +
Page count: 19
Related papers
50 in total
  • [1] De novo protein design by deep network hallucination
    Ivan Anishchenko
    Samuel J. Pellock
    Tamuka M. Chidyausiku
    Theresa A. Ramelot
    Sergey Ovchinnikov
    Jingzhou Hao
    Khushboo Bafna
    Christoffer Norn
    Alex Kang
    Asim K. Bera
    Frank DiMaio
    Lauren Carter
    Cameron M. Chow
    Gaetano T. Montelione
    David Baker
    Nature, 2021, 600 : 547 - 552
  • [2] Deep learning based hallucination of de novo protein assemblies accross the nanoscale
    Courbet, A.
    ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2022, 78 : E77 - E77
  • [3] Improving de novo protein binder design with deep learning
    Bennett, Nathaniel R.
    Coventry, Brian
    Goreshnik, Inna
    Huang, Buwei
    Allen, Aza
    Vafeados, Dionne
    Peng, Ying Po
    Dauparas, Justas
    Baek, Minkyung
    Stewart, Lance
    DiMaio, Frank
    De Munck, Steven
    Savvides, Savvas N.
    Baker, David
    NATURE COMMUNICATIONS, 2023, 14 (01)
  • [4] Improving de novo protein binder design with deep learning
    Nathaniel R. Bennett
    Brian Coventry
    Inna Goreshnik
    Buwei Huang
    Aza Allen
    Dionne Vafeados
    Ying Po Peng
    Justas Dauparas
    Minkyung Baek
    Lance Stewart
    Frank DiMaio
    Steven De Munck
    Savvas N. Savvides
    David Baker
    Nature Communications, 14
  • [5] De novo protein design
    O'Driscoll, Cath
    CHEMISTRY & INDUSTRY, 2020, 84 (03) : 8 - 8
  • [6] De novo protein design
    Degrado, William
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2019, 257
  • [7] De novo protein design
    Degrado, William
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2018, 255
  • [8] De novo protein design
    Koehl, P
    Levitt, M
    DYNAMICS, STRUCTURE AND FUNCTION OF BIOLOGICAL MACROMOLECULES, 2001, 315 : 57 - 75
  • [9] Protein de novo design
    Tuchscherer, G
    Dumy, P
    Mutter, M
    CHIMIA, 1996, 50 (12) : 644 - 648
  • [10] De novo protein design by inversion of the AlphaFold structure prediction network
    Goverde, Casper A.
    Wolf, Benedict
    Khakzad, Hamed
    Rosset, Stephane
    Correia, Bruno E.
    PROTEIN SCIENCE, 2023, 32 (06)