Transformation of alignment files improves performance of variant callers for long-read RNA sequencing data

Authors
Vladimir B. C. de Souza
Ben T. Jordan
Elizabeth Tseng
Elizabeth A. Nelson
Karen K. Hirschi
Gloria Sheynkman
Mark D. Robinson
Affiliations
[1] University of Zurich, Department of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics
[2] University of Virginia, Department of Molecular Physiology and Biological Physics
[3] PacBio, Department of Cell Biology and Cardiovascular Research Center
[4] University of Virginia School of Medicine, Department of Medicine
[5] Yale University School of Medicine, Department of Genetics
[6] Yale University School of Medicine, Department of Biochemistry and Molecular Genetics
[7] Yale Cardiovascular Research Center, Center for Public Health Genomics
[8] Yale University School of Medicine, UVA Comprehensive Cancer Center
Abstract
Long-read RNA sequencing (lrRNA-seq) produces detailed information about full-length transcripts, including novel and sample-specific isoforms. Furthermore, there is an opportunity to call variants directly from lrRNA-seq data. However, most state-of-the-art variant callers have been developed for genomic DNA. This work has two objectives. First, we perform a mini-benchmark of GATK, DeepVariant, Clair3, and NanoCaller, primarily on PacBio Iso-Seq data but also on Nanopore and Illumina RNA-seq data. Second, we propose a pipeline to process spliced-alignment files, making them suitable for variant calling with DNA-based callers. With such manipulations, high calling performance can be achieved using DeepVariant on Iso-Seq data.