Comprehensive comparison of large-scale tissue expression datasets

Cited by: 63

Authors:

Santos, Alberto ^{[1]}

Tsafou, Kalliopi ^{[1]}

Stolte, Christian ^{[2]}

Pletscher-Frankild, Sune ^{[1]}

O'Donoghue, Sean I. ^{[2,3]}

Jensen, Lars Juhl ^{[1]}

Affiliations:

[1] Univ Copenhagen, Fac Hlth & Med Sci, Novo Nordisk Fdn Ctr Prot Res, Copenhagen, Denmark

[2] CSIRO, Sydney, NSW, Australia

[3] Garvan Inst Med Res, Sydney, NSW, Australia

Source:

PEERJ | 2015 / Vol. 3

Funding:

US National Institutes of Health (NIH);

Keywords:

Immunohistochemistry; RNA sequencing; Tissue expression; Mass spectrometry; Microarrays; Databases; Tissue-specificity; GENE-EXPRESSION; MASS-SPECTROMETRY; HOUSEKEEPING GENES; RNA-SEQ; ATLAS; SPECIFICITY; MICROARRAY; DATABASE; DRAFT;

DOI:

10.7717/peerj.1054

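Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];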

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

For tissues to carry out their functions, they rely on the right proteins to be present. Several high-throughput technologies have been used to map out which proteins are expressed in which tissues; however, the data have not previously been systematically compared and integrated. We present a comprehensive evaluation of tissue expression data from a variety of experimental techniques and show that these agree surprisingly well with each other and with results from literature curation and text mining. We further found that most datasets support the assumed but not demonstrated distinction between tissue-specific and ubiquitous expression. By developing comparable confidence scores for all types of evidence, we show that it is possible to improve both quality and coverage by combining the datasets. To facilitate use and visualization of our work, we have developed the TISSUES resource (http://tissues.jensenlab.org), which makes all the scored and integrated data available through a single user-friendly web interface.

Pages: 23
