An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era

Cited by: 114
Authors
Su, Zhenqiang [1, 2]
Fang, Hong [1]
Hong, Huixiao [1]
Shi, Leming [3, 4, 5, 6]
Zhang, Wenqian [1]
Zhang, Wenwei [6, 7]
Zhang, Yanyan [7]
Dong, Zirui [7, 8]
Lancashire, Lee J. [3]
Bessarabova, Marina [2]
Yang, Xi [1]
Ning, Baitang [1]
Gong, Binsheng [1]
Meehan, Joe [1]
Xu, Joshua [1]
Ge, Weigong [1]
Perkins, Roger [1]
Fischer, Matthias [8, 9]
Tong, Weida [1]
Affiliations
[1] US FDA, Natl Ctr Toxicol Res, Jefferson, AR 72079 USA
[2] Thomson Reuters, IP & Sci, Boston, MA 02210 USA
[3] Fudan Univ, Sch Life Sci & Pharm, State Key Lab Genet Engn, Shanghai 201203, Peoples R China
[4] Fudan Univ, Sch Life Sci & Pharm, MOE Key Lab Contemporary Anthropol, Shanghai 201203, Peoples R China
[5] Fudan Zhangjiang Ctr Clin Genom, Shanghai 201203, Peoples R China
[6] Zhangjiang Ctr Translat Med, Shanghai 201203, Peoples R China
[7] BGI Shenzhen, Guangdong 518083, Peoples R China
[8] Univ Childrens Hosp Cologne, Dept Pediat Oncol & Hematol, D-50924 Cologne, Germany
[9] Univ Childrens Hosp Cologne, Ctr Mol Med CMMC, D-50924 Cologne, Germany
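Source
GENOME BIOLOGY | 2014 / Volume 15 / Issue 12
Funding
US National Science Foundation; National High Technology Research and Development Program of China (863 Program)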
Keywords
GENE-EXPRESSION SIGNATURE; REPRODUCIBILITY
DOI
10.1186/s13059-014-0523-y
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Gene expression microarray has been the primary biomarker platform ubiquitously applied in biomedical research, resulting in enormous data, predictive models, and biomarkers accrued. Recently, RNA-seq has looked likely to replace microarrays, but there will be a period where both technologies co-exist. This raises two important questions: Can microarray-based models and biomarkers be directly applied to RNA-seq data? Can future RNA-seq-based predictive models and biomarkers be applied to microarray data to leverage past investment?
Results: We systematically evaluated the transferability of predictive models and signature genes between microarray and RNA-seq using two large clinical data sets. The complexity of cross-platform sequence correspondence was considered in the analysis and examined using three human and two rat data sets, and three levels of mapping complexity were revealed. Three algorithms representing different modeling complexity were applied to the three levels of mappings for each of the eight binary endpoints and Cox regression was used to model survival times with expression data. In total, 240,096 predictive models were examined.
Conclusions: Signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development, and microarray-based models can accurately predict RNA-seq-profiled samples; while RNA-seq-based models are less accurate in predicting microarray-profiled samples and are affected both by the choice of modeling algorithm and the gene mapping complexity. The results suggest continued usefulness of legacy microarray data and established microarray biomarkers and predictive models in the forthcoming RNA-seq era.
Pages: 26