Hold out the genome: a roadmap to solving the cis-regulatory code

Cited by: 22
Authors
de Boer, Carl G. [1]
Taipale, Jussi [2, 3, 4]
Affiliations
[1] Univ British Columbia, Sch Biomed Engn, Vancouver, BC, Canada
[2] Univ Helsinki, Fac Med, Appl Tumor Genom Res Program, Helsinki, Finland
[3] Karolinska Inst, Dept Med Biochem & Biophys, Stockholm, Sweden
[4] Univ Cambridge, Dept Biochem, Cambridge, England
Keywords
ENHANCER ACTIVITY MAPS; TRANSCRIPTION FACTORS; SHADOW ENHANCERS; GENE; SEQUENCE; BINDING; EVOLUTION; EXPRESSION; ELEMENTS; MODEL
DOI
10.1038/s41586-023-06661-w
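Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09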
Abstract
Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The 'cis-regulatory code' - how cells interpret DNA sequences to determine when, where and how much genes should be expressed - has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.
Pages: 41-50
Page count: 10
Source
Nature, 2024, 625 (7993): 41-50