DOCEST-fast and accurate estimator of human NGS sequencing depth and error rate

Authors
Kaplinski, Lauris [1]
Mols, Mart [1]
Puurand, Tarmo [1]
Remm, Maido [1]
Affiliations
[1] Univ Tartu, Inst Mol & Cell Biol, Riia 23, EE-51010 Tartu, Estonia
DOI
10.1093/bioadv/vbad084
Chinese Library Classification (CLC) number
R73 [Oncology];
Discipline classification code
100214;
Abstract
Motivation: Accurate estimation of next-generation sequencing depth of coverage is needed for detecting the copy number of repeated elements in the human genome. The common methods for estimating sequencing depth count the reads mapped to the genome or to subgenomic regions, which makes them sensitive to mapping quality. Contamination, or a large deviation of an individual genome from the reference, may introduce bias in depth estimation.
Results: Here, we present an algorithm and implementation for estimating both the sequencing depth and error rate from unmapped reads using a uniquely filtered k-mer set. On simulated reads with 20x coverage, the margin of error was less than 0.01%. At 0.01x coverage and in the presence of 10-fold contamination, the precision was within 2% for depth and within 10% for error rate.
Availability and implementation: DOCEST program and database can be downloaded from https://bioinfo.ut.ee/docest/.
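The idea of estimating depth from unmapped reads with a set of unique k-mers can be illustrated with a toy sketch. This is not DOCEST's implementation: the function names `unique_kmers` and `estimate_depth` are hypothetical, and the sketch assumes error-free, single-strand reads with no contamination. It counts how often the selected k-mers occur in the reads and rescales that k-mer coverage to base coverage, since a read of length L contains only L - k + 1 k-mers.

```python
from collections import Counter


def unique_kmers(genome, k):
    """Return the k-mers that occur exactly once in a linear genome string.

    Hypothetical stand-in for a pre-filtered k-mer database.
    """
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    return {kmer for kmer, n in counts.items() if n == 1}


def estimate_depth(reads, kmers, k):
    """Estimate base-level depth of coverage from raw (unmapped) reads.

    Assumes error-free reads from one strand; real data would also need
    reverse complements and an error model.
    """
    hits = 0
    for read in reads:
        for i in range(len(read) - k + 1):
            if read[i:i + k] in kmers:
                hits += 1
    # Mean number of times each selected k-mer was observed.
    kmer_coverage = hits / len(kmers)
    mean_len = sum(len(r) for r in reads) / len(reads)
    # A read of length L yields only L - k + 1 k-mers, so k-mer coverage
    # underestimates base coverage by the factor (L - k + 1) / L.
    return kmer_coverage * mean_len / (mean_len - k + 1)
```

Because no read is ever aligned, the estimate does not depend on mapping quality, and reads from a contaminant that shares none of the selected k-mers contribute no hits. Sequencing errors would lower the observed k-mer counts, which is why a practical method has to estimate the error rate together with the depth.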
Pages: 4