Bazam: a rapid method for read extraction and realignment of high-throughput sequencing data

Authors
Simon P. Sadedin
Alicia Oshlack
Affiliations
[1] Royal Children’s Hospital, Bioinformatics, Murdoch Children’s Research Institute
[2] Royal Children’s Hospital, Victorian Clinical Genetics Services
[3] University of Melbourne, Department of BioScience
Abstract
The vast quantities of short-read sequencing data being generated are often exchanged and stored as aligned reads. However, aligned data becomes outdated as new reference genomes and alignment methods become available. Here we describe Bazam, a tool that efficiently extracts the original paired FASTQ from alignment files (BAM or CRAM format) in a format that directly allows efficient realignment. Bazam facilitates up to a 90% reduction in the time for realignment compared to standard methods. Bazam can support selective extraction of read pairs from focused genomic regions for applications such as targeted region analyses, quality control, structural variant calling, and alignment comparisons.
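The abstract does not describe how the extraction works internally, so the following is only a minimal sketch of the general idea: stream through aligned reads, hold each read until its mate appears, and emit the pair as interleaved FASTQ in original sequencing orientation. The `AlignedRead` record and the function names are hypothetical and are not Bazam's API; the sketch assumes only primary alignments are supplied and that each read name occurs exactly twice.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


class AlignedRead(NamedTuple):
    """Hypothetical stand-in for one primary alignment record."""
    name: str
    seq: str          # sequence as stored in the alignment file
    qual: str         # quality string as stored in the alignment file
    is_first: bool    # True for read 1 of the pair
    is_reverse: bool  # True if aligned to the reverse strand


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def original_orientation(read: AlignedRead) -> Tuple[str, str]:
    """Undo the reverse-complement applied to reverse-strand alignments."""
    if read.is_reverse:
        return read.seq.translate(_COMPLEMENT)[::-1], read.qual[::-1]
    return read.seq, read.qual


def interleaved_fastq(reads: Iterable[AlignedRead]) -> Iterator[str]:
    """Yield FASTQ records, read 1 then read 2, as soon as a pair is complete."""
    pending: Dict[str, AlignedRead] = {}  # reads still waiting for their mate
    for read in reads:
        mate = pending.pop(read.name, None)
        if mate is None:
            pending[read.name] = read
            continue
        first, second = (read, mate) if read.is_first else (mate, read)
        for r, suffix in ((first, "/1"), (second, "/2")):
            seq, qual = original_orientation(r)
            yield f"@{r.name}{suffix}\n{seq}\n+\n{qual}\n"
```

Because pairs are emitted as soon as both mates have been seen, output of this kind can in principle be streamed into an aligner that accepts interleaved paired input, without first re-sorting the whole file by read name.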