Accurate quantification of transcriptome from RNA-Seq data by effective length normalization

Cited by: 77

Authors:

Lee, Soohyun [1]

Seo, Chae Hwa [1]

Lim, Byungho [2]

Yang, Jin Ok [1]

Oh, Jeongsu [1]

Kim, Minjin [2]

Lee, Sooncheol [2]

Lee, Byungwook [1]

Kang, Changwon [2]

Lee, Sanghyuk [1,3]

Affiliations:

[1] KRIBB, Korean Bioinformat Ctr KOBIC, 111 Gwahangno, Taejon 305806, South Korea

[2] Korea Adv Inst Sci & Technol, Dept Biol Sci, Taejon 305701, South Korea

[3] Ewha Womans Univ, Div Life & Pharmaceut Sci, ERCSB, Seoul 120750, South Korea

Source:

NUCLEIC ACIDS RESEARCH | 2011 / Vol. 39 / No. 02

Keywords:

EXPRESSION;

DOI:

10.1093/nar/gkq1015

Chinese Library Classification number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Subject classification code:

071010 ; 081704 ;

Abstract:

We propose a novel, efficient and intuitive approach for estimating mRNA abundances from whole-transcriptome shotgun sequencing (RNA-Seq) data. Our method, NEUMA (Normalization by Expected Uniquely Mappable Area), is based on effective length normalization using uniquely mappable areas of gene and mRNA isoform models. Using a known transcriptome sequence model such as RefSeq, NEUMA pre-computes the numbers of all possible gene-wise and isoform-wise informative reads: the former being sequences that map exclusively to the mRNA isoforms of a single gene and the latter sequences that map uniquely to a single mRNA isoform. The results are used to estimate the effective length of genes and transcripts, taking experimental distributions of fragment size into consideration. Quantitative RT-PCR based on 27 randomly selected genes in two human cell lines and computer simulation experiments demonstrated superior accuracy of NEUMA over other recently developed methods. NEUMA covers a large proportion of genes and mRNA isoforms and offers a measure of consistency ('consistency coefficient') for each gene between an independently measured gene-wise level and the sum of the isoform levels. NEUMA is applicable to both paired-end and single-end RNA-Seq data. We propose that NEUMA could serve as a standard method for quantifying gene transcript levels from RNA-Seq data.
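
The idea of normalizing by a uniquely mappable area can be illustrated with a minimal Python sketch. It is not the NEUMA implementation: the function names (`unique_mappable_counts`, `normalize`), the toy sequences, and the per-kilobase, per-million scaling are assumptions made for illustration, and the sketch handles only single-end reads of a fixed length, ignoring the fragment-size distributions and gene-wise counts mentioned in the abstract.

```python
from collections import defaultdict


def unique_mappable_counts(transcripts, read_len):
    """Count, per isoform, the read-length windows that occur in that isoform only.

    Illustrative assumption: a window is "isoform-wise informative" when its
    sequence is found in exactly one isoform of the given transcript set.
    """
    owners = defaultdict(set)  # window sequence -> names of isoforms containing it
    for name, seq in transcripts.items():
        for i in range(len(seq) - read_len + 1):
            owners[seq[i:i + read_len]].add(name)

    counts = {name: 0 for name in transcripts}
    for name, seq in transcripts.items():
        for i in range(len(seq) - read_len + 1):
            if owners[seq[i:i + read_len]] == {name}:
                counts[name] += 1
    return counts


def normalize(read_counts, effective_lengths, total_reads):
    """Scale informative read counts per kilobase of effective length per million reads.

    Isoforms with no uniquely mappable window get None, because they cannot
    be quantified from informative reads alone.
    """
    result = {}
    for name, n_reads in read_counts.items():
        eff = effective_lengths[name]
        result[name] = None if eff == 0 else n_reads * 1e9 / (eff * total_reads)
    return result


# Hypothetical toy transcriptome: two isoforms sharing their first eight bases.
toy_transcripts = {"iso_a": "ACGTACGGTTCA", "iso_b": "ACGTACGGAAAC"}
toy_effective = unique_mappable_counts(toy_transcripts, read_len=4)
toy_levels = normalize({"iso_a": 8, "iso_b": 2}, toy_effective, total_reads=1000)
```

In this toy case only the windows that overlap each isoform's distinct 3' end are counted, so both isoforms are normalized by the same small effective length instead of their full transcript length.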


Pages: 10

