Accurate quantification of transcriptome from RNA-Seq data by effective length normalization

Cited by: 77
Authors
Lee, Soohyun [1]
Seo, Chae Hwa [1]
Lim, Byungho [2]
Yang, Jin Ok [1]
Oh, Jeongsu [1]
Kim, Minjin [2]
Lee, Sooncheol [2]
Lee, Byungwook [1]
Kang, Changwon [2]
Lee, Sanghyuk [1, 3]
Affiliations
[1] KRIBB, Korean Bioinformat Ctr KOBIC, 111 Gwahangno, Taejon 305806, South Korea
[2] Korea Adv Inst Sci & Technol, Dept Biol Sci, Taejon 305701, South Korea
[3] Ewha Womans Univ, Div Life & Pharmaceut Sci, ERCSB, Seoul 120750, South Korea
Keywords
EXPRESSION
DOI
10.1093/nar/gkq1015
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We propose a novel, efficient and intuitive approach to estimating mRNA abundances from whole transcriptome shotgun sequencing (RNA-Seq) data. Our method, NEUMA (Normalization by Expected Uniquely Mappable Area), is based on effective length normalization using the uniquely mappable areas of gene and mRNA isoform models. Using a known transcriptome sequence model such as RefSeq, NEUMA pre-computes the numbers of all possible gene-wise and isoform-wise informative reads: the former are sequences that map exclusively to the mRNA isoforms of a single gene, and the latter are sequences that map uniquely to a single mRNA isoform. These counts are used to estimate the effective lengths of genes and transcripts, taking the experimental distribution of fragment sizes into consideration. Quantitative RT-PCR on 27 randomly selected genes in two human cell lines, together with computer simulation experiments, demonstrated superior accuracy of NEUMA over other recently developed methods. NEUMA covers a large proportion of genes and mRNA isoforms and offers, for each gene, a measure of consistency (the 'consistency coefficient') between the independently measured gene-wise level and the sum of the isoform levels. NEUMA is applicable to both paired-end and single-end RNA-Seq data. We propose that NEUMA could become a standard method for quantifying gene transcript levels from RNA-Seq data.
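The idea of effective length normalization can be illustrated with a minimal Python sketch. This is not the NEUMA implementation: it assumes single-end reads of one fixed length, ignores the fragment size distribution that the abstract says NEUMA accounts for, and covers only the isoform-wise case. The function names `unique_start_counts` and `normalized_abundance`, the toy sequences, and the per-kilobase, per-million scaling are all assumptions made for illustration.

```python
from collections import defaultdict


def unique_start_counts(transcripts, read_len):
    """Count, per transcript, the read start positions whose read-length
    subsequence occurs in no other transcript (a toy 'effective length')."""
    owners = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - read_len + 1):
            owners[seq[i:i + read_len]].add(name)

    counts = {name: 0 for name in transcripts}
    for name, seq in transcripts.items():
        for i in range(len(seq) - read_len + 1):
            # Informative only if every occurrence lies in this transcript.
            if owners[seq[i:i + read_len]] == {name}:
                counts[name] += 1
    return counts


def normalized_abundance(read_counts, effective_lengths, total_reads):
    """Informative reads per kilobase of effective length per million reads.
    Transcripts with no uniquely mappable position get None."""
    result = {}
    for name, n_reads in read_counts.items():
        eff = effective_lengths[name]
        result[name] = None if eff == 0 else n_reads * 1e9 / (eff * total_reads)
    return result


# Hypothetical toy isoforms sharing a common 5' region.
toy_transcripts = {"iso_a": "ACGTACGGTT", "iso_b": "ACGTACCCAA"}
eff_len = unique_start_counts(toy_transcripts, read_len=4)
levels = normalized_abundance({"iso_a": 8, "iso_b": 2}, eff_len, total_reads=1000)
```

In this toy case both isoforms have the same number of uniquely mappable start positions, so their normalized levels differ only by the ratio of their informative read counts.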
Pages: 10