Direct comparison of performance of single nucleotide variant calling in human genome with alignment-based and assembly-based approaches

Cited by: 0
Authors
Leihong Wu
Gokhan Yavas
Huixiao Hong
Weida Tong
Wenming Xiao
Affiliations
[1] National Center for Toxicological Research
[2] US Food and Drug Administration
Source
Scientific Reports | Volume 7
Keywords
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Complementary to reference-based variant detection, recent studies have shown that many novel variants can be detected with de novo assembled genomes. To evaluate the effect of read coverage and the accuracy of assembly-based variant calling, we simulated short reads containing more than 3 million single nucleotide variants (SNVs) from the whole human genome and compared the efficiency of SNV calling between the assembly-based and alignment-based approaches. We assessed the quality of the assembled contigs and found that a minimum of 30X short-read coverage was needed to ensure reliable SNV calling and to generate assembled contigs with good coverage of the genome and genes. In addition, we observed that the assembly-based approach had much lower recall and precision than the alignment-based approach, which recovered 99% of the imputed SNVs. We observed similar results with experimental reads for NA24385, an individual whose germline variants are well characterized. Although the assembly-based approach adds value for SNV detection, it carries a high risk of false discovery of novel SNVs. Further improvement of de novo assembly algorithms is needed to ensure good genome completeness with resolved haplotypes and high fidelity of the assembled sequences.
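The recall, precision, and coverage figures quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `precision_recall` and `reads_for_coverage`, the toy variant tuples, the 3 Gb genome size, and the 150 bp read length are all assumptions made for illustration only.

```python
from math import ceil


def precision_recall(truth, calls):
    """Return (precision, recall) for two collections of SNVs.

    Each SNV is a (chrom, pos, ref, alt) tuple; a call counts as a true
    positive only if it matches a truth entry exactly.
    """
    truth, calls = set(truth), set(calls)
    true_pos = len(truth & calls)
    precision = true_pos / len(calls) if calls else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall


def reads_for_coverage(genome_size, read_length, coverage):
    """Reads needed for a mean depth, from coverage = reads * read_length / genome_size."""
    return ceil(coverage * genome_size / read_length)


# Toy truth set (hypothetical simulated SNVs) and toy call set (hypothetical caller output).
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "A"),
    ("chr2", 90, "T", "C"),
}
calls = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "A"),
    ("chr3", 7, "A", "C"),  # false positive: not in the truth set
}

precision, recall = precision_recall(truth, calls)
print(f"precision={precision:.2f} recall={recall:.2f}")

# Assumed values: ~3 Gb genome, 150 bp reads, 30X target depth.
print(reads_for_coverage(3_000_000_000, 150, 30))
```

Exact tuple matching is the simplest possible comparison; a real evaluation would also need to normalize variant representation and handle genotype, which this sketch leaves out.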
Related papers
29 in total
  • [1] Direct comparison of performance of single nucleotide variant calling in human genome with alignment-based and assembly-based approaches
    Wu, Leihong
    Yavas, Gokhan
    Hong, Huixiao
    Tong, Weida
    Xiao, Wenming
    SCIENTIFIC REPORTS, 2017, 7
  • [2] FermiKit: assembly-based variant calling for Illumina resequencing data
    Li, Heng
    BIOINFORMATICS, 2015, 31 (22) : 3694 - 3696
  • [3] Benchmarking datasets for assembly-based variant calling using high-fidelity long reads
    Hyunji Lee
    Jun Kim
    Junho Lee
    BMC Genomics, 24
  • [4] Performance assessment of de novo assembly-based structural variation detection in the human genome
    Xiao, Chunlin
    CANCER RESEARCH, 2018, 78 (13)
  • [5] Benchmarking datasets for assembly-based variant calling using high-fidelity long reads
    Lee, Hyunji
    Kim, Jun
    Lee, Junho
    BMC GENOMICS, 2023, 24 (01)
  • [6] Handling the Alignment for Wake Word Detection: A Comparison Between Alignment-Based, Alignment-Free and Hybrid Approaches
    Ribeiro, Vinicius
    Huang, Yiteng
    Yuan Shangguan
    Yang, Zhaojun
    Wan, Li
    Sun, Ming
    INTERSPEECH 2023, 2023: 5366 - 5370
  • [7] Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data
    Liu, Yichen Henry
    Luo, Can
    Golding, Staunton G.
    Ioffe, Jacob B.
    Zhou, Xin Maizie
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [8] Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data
    Yichen Henry Liu
    Can Luo
    Staunton G. Golding
    Jacob B. Ioffe
    Xin Maizie Zhou
    Nature Communications, 15
  • [9] DAVI: Deep learning-based tool for alignment and single nucleotide variant identification
    Gupta, G.
    Saini, S.
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2020, 1 (02)
  • [10] Aquila_stLFR: diploid genome assembly based structural variant calling package for stLFR linked-reads
    Liu, Yichen Henry
    Grubbs, Griffin L.
    Zhang, Lu
    Fang, Xiaodong
    Dill, David L.
    Sidow, Arend
    Zhou, Xin
    BIOINFORMATICS ADVANCES, 2021, 1 (01)