A foundation model of transcription across human cell types

Authors
Fu, Xi [1, 2]
Mo, Shentong [3, 4]
Buendia, Alejandro [1]
Laurent, Anouchka P. [5]
Shao, Anqi [6]
Alvarez-Torres, Maria del Mar [1]
Yu, Tianji [1]
Tan, Jimin [7]
Su, Jiayu [1]
Sagatelian, Romella [1]
Ferrando, Adolfo A. [5, 8]
Ciccia, Alberto [9]
Lan, Yanyan [10, 11]
Owens, David M. [6, 12]
Palomero, Teresa [5, 12]
Xing, Eric P. [3, 4]
Rabadan, Raul [1, 2]
Affiliations
[1] Columbia Univ, Dept Syst Biol, Program Math Genom, New York, NY 10027 USA
[2] Columbia Univ, Dept Biomed Informat, New York, NY 10027 USA
[3] Mohamed Bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
[4] Carnegie Mellon Univ, Dept Machine Learning, Pittsburgh, PA 15213 USA
[5] Columbia Univ, Inst Canc Genet, New York, NY USA
[6] Columbia Univ, Dept Dermatol, New York, NY USA
[7] NYU, Inst Syst Genet, Grossman Sch Med, New York, NY USA
[8] Regeneron, Regeneron Genet Ctr, Tarrytown, NY USA
[9] Columbia Univ, Dept Genet & Dev, New York, NY USA
[10] Tsinghua Univ, Inst AI Ind Res, Beijing, Peoples R China
[11] Tsinghua Univ, Beijing Frontier Res Ctr Biol Struct, Beijing, Peoples R China
[12] Columbia Univ, Dept Pathol & Cell Biol, New York, NY USA
Keywords
GENE-EXPRESSION; TARGET GENES; PAX5; METHYLATION; CHROMATIN; DNA
DOI
10.1038/s41586-024-08391-z
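Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09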
Abstract
Transcriptional regulation, which involves a complex interplay between regulatory sequences and proteins, directs all biological processes. Computational models of transcription lack generalizability to accurately extrapolate to unseen cell types and conditions. Here we introduce GET (general expression transformer), an interpretable foundation model designed to uncover regulatory grammars across 213 human fetal and adult cell types. Relying exclusively on chromatin accessibility data and sequence information, GET achieves experimental-level accuracy in predicting gene expression even in previously unseen cell types. GET also shows remarkable adaptability across new sequencing platforms and assays, enabling regulatory inference across a broad range of cell types and conditions, and uncovers universal and cell-type-specific transcription factor interaction networks. We evaluated its performance in prediction of regulatory activity, inference of regulatory elements and regulators, and identification of physical interactions between transcription factors and found that it outperforms current models in predicting lentivirus-based massively parallel reporter assay readout. In fetal erythroblasts, we identified distal (greater than 1 Mbp) regulatory regions that were missed by previous models, and, in B cells, we identified a lymphocyte-specific transcription factor-transcription factor interaction that explains the functional significance of a leukaemia risk predisposing germline mutation. In sum, we provide a generalizable and accurate model for transcription together with catalogues of gene regulation and transcription factor interactions, all with cell type specificity.
Pages: 965-973
Number of pages: 28