Comprehensive comparison of large-scale tissue expression datasets

Cited by: 63
Authors
Santos, Alberto [1]
Tsafou, Kalliopi [1]
Stolte, Christian [2]
Pletscher-Frankild, Sune [1]
O'Donoghue, Sean I. [2, 3]
Jensen, Lars Juhl [1]
Affiliations
[1] Univ Copenhagen, Fac Hlth & Med Sci, Novo Nordisk Fdn Ctr Prot Res, Copenhagen, Denmark
[2] CSIRO, Sydney, NSW, Australia
[3] Garvan Inst Med Res, Sydney, NSW, Australia
Source
PeerJ | 2015 / Volume 3
Funding
US National Institutes of Health
Keywords
Immunohistochemistry; RNA sequencing; Tissue expression; Mass spectrometry; Microarrays; Databases; Tissue-specificity; GENE-EXPRESSION; MASS-SPECTROMETRY; HOUSEKEEPING GENES; RNA-SEQ; ATLAS; SPECIFICITY; MICROARRAY; DATABASE; DRAFT;
DOI
10.7717/peerj.1054
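Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract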
For tissues to carry out their functions, they rely on the right proteins to be present. Several high-throughput technologies have been used to map out which proteins are expressed in which tissues; however, the data have not previously been systematically compared and integrated. We present a comprehensive evaluation of tissue expression data from a variety of experimental techniques and show that these agree surprisingly well with each other and with results from literature curation and text mining. We further found that most datasets support the assumed but not demonstrated distinction between tissue-specific and ubiquitous expression. By developing comparable confidence scores for all types of evidence, we show that it is possible to improve both quality and coverage by combining the datasets. To facilitate use and visualization of our work, we have developed the TISSUES resource (http://tissues.jensenlab.org), which makes all the scored and integrated data available through a single user-friendly web interface.
Pages: 23