NCBI GEO: archive for gene expression and epigenomics data sets: 23-year update

Cited by: 81
Authors
Clough, Emily [1]
Barrett, Tanya [1]
Wilhite, Stephen E. [1]
Ledoux, Pierre [1]
Evangelista, Carlos [1]
Kim, Irene F. [1]
Tomashevsky, Maxim [1]
Marshall, Kimberly A. [1]
Phillippy, Katherine H. [1]
Sherman, Patti M. [1]
Lee, Hyeseung [1]
Zhang, Naigong [1]
Serova, Nadezhda [1]
Wagner, Lukas [1]
Zalunin, Vadim [1]
Kochergin, Andrey [1]
Soboleva, Alexandra [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20892 USA
Keywords
RNA-SEQ; OMNIBUS; PRINCIPLES; CHROMATIN; MAPS
DOI
10.1093/nar/gkad965
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The Gene Expression Omnibus (GEO) is an international public repository that archives gene expression and epigenomics data sets generated by next-generation sequencing and microarray technologies. Data are typically submitted to GEO by researchers in compliance with widespread journal and funder mandates to make generated data publicly accessible. The resource handles raw data files, processed data files and descriptive metadata for over 200 000 studies and 6.5 million samples, all of which are indexed, searchable and downloadable. Additionally, GEO offers web-based tools that facilitate analysis and visualization of differential gene expression. This article presents the current status and recent advancements in GEO, including the generation of consistently computed gene expression count matrices for thousands of RNA-seq studies, and new interactive graphical plots in GEO2R that help users identify differentially expressed genes and assess data set quality. The GEO repository is built and maintained by the National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine (NLM), and is publicly accessible at https://www.ncbi.nlm.nih.gov/geo/.
Pages: D138 - D144
Page count: 7
Related papers
50 in total
  • [31] Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data
    Tintle, Nathan L.
    Sitarik, Alexandra
    Boerema, Benjamin
    Young, Kylie
    Best, Aaron A.
    DeJongh, Matthew
    BMC BIOINFORMATICS, 2012, 13
  • [32] Identification of Characteristic Immune Profiles from Gene Expression Data in Renal Cell Carcinoma Gene Expression TCGA Data Sets
    Volyanskyy, Konstantin
    Zhong, Minghao
    Lucito, Robert
    Fanucchi, Michael
    Fallon, John T.
    Dimitrova, Nevenka
    MODERN PATHOLOGY, 2018, 31 : 398 - 398
  • [33] Identification of Characteristic Immune Profiles from Gene Expression Data in Renal Cell Carcinoma Gene Expression TCGA Data Sets
    Volyanskyy, Konstantin
    Zhong, Minghao
    Lucito, Robert
    Fanucchi, Michael
    Fallon, John T.
    Dimitrova, Nevenka
    LABORATORY INVESTIGATION, 2018, 98 : 398 - 398
  • [34] Random forests-based differential analysis of gene sets for gene expression data
    Hsueh, Huey-Miin
    Zhou, Da-Wei
    Tsai, Chen-An
    GENE, 2013, 518 (01) : 179 - 186
  • [35] Hot papers - The rise of free, global gene expression data sets
    Kling, J
    SCIENTIST, 2002, 16 (07) : 34 - 35
  • [36] Identification of Susceptibility Genes to Allergic Rhinitis by Gene Expression Data Sets
    Xue, Kai
    Yang, Jingpu
    Zhao, Yin
    Cheng, Jinzhang
    Wang, Zonggui
    CTS-CLINICAL AND TRANSLATIONAL SCIENCE, 2020, 13 (01) : 169 - 178
  • [37] Inclusion of Textual Documentation in the Analysis of Multidimensional Data Sets: Application to Gene Expression Data
    Soumya Raychaudhuri
    Hinrich Schütze
    Russ B. Altman
    Machine Learning, 2003, 52 : 119 - 145
  • [38] Inclusion of textual documentation in the analysis of multidimensional data sets: Application to gene expression data
    Raychaudhuri, S
    Schütze, H
    Altman, RB
    MACHINE LEARNING, 2003, 52 (1-2) : 119 - 145
  • [39] Can Survival Prediction Be Improved By Merging Gene Expression Data Sets?
    Yasrebi, Haleh
    Sperisen, Peter
    Praz, Viviane
    Bucher, Philipp
    PLOS ONE, 2009, 4 (10)
  • [40] Exact biclustering algorithm for the analysis of large gene expression data sets
    Oliver Voggenreiter
    Stefan Bleuler
    Wilhelm Gruissem
    BMC Bioinformatics, 2012, 13 (Suppl 18)