LoRDEC: accurate and efficient long read error correction

Cited by: 526
Authors
Salmela, Leena [1, 2]
Rivals, Eric [3, 4, 5]
Affiliations
[1] Univ Helsinki, Dept Comp Sci, FI-00014 Helsinki, Finland
[2] Univ Helsinki, Helsinki Inst Informat Technol, FI-00014 Helsinki, Finland
[3] LIRMM, F-34095 Montpellier 5, France
[4] CNRS, Inst Biol Computat, F-34095 Montpellier 5, France
[5] Univ Montpellier, F-34095 Montpellier 5, France
Funding
Academy of Finland;
Keywords
BASIC LOCAL ALIGNMENT; GENOME ASSEMBLIES;
DOI
10.1093/bioinformatics/btu538
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: PacBio single molecule real-time sequencing is a third-generation sequencing technique that produces long reads, with comparatively lower throughput and a higher error rate. The errors include numerous indels and complicate downstream analyses such as mapping or de novo assembly. A hybrid strategy that takes advantage of the high accuracy of second-generation short reads has been proposed for correcting long reads. Mapping short reads onto long reads provides sufficient coverage to eliminate up to 99% of errors, but at the expense of prohibitive running times and considerable amounts of disk and memory space.
Results: We present LoRDEC, a hybrid error correction method that builds a succinct de Bruijn graph representing the short reads and seeks a corrective sequence for each erroneous region in the long reads by traversing chosen paths in the graph. Compared with available tools, LoRDEC is at least six times faster and requires at least 93% less memory or disk space, while achieving comparable accuracy.
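The idea stated in the Results sentence can be illustrated with a toy sketch. It is not LoRDEC's implementation: the abstract specifies a succinct de Bruijn graph, whereas this sketch assumes a plain Python set of k-mers, an exhaustive bounded path search, and edit distance as the selection criterion. All function names and parameters (`solid_kmers`, `bridge`, `correct_read`, `min_count`, `slack`) are hypothetical.

```python
from collections import Counter

def solid_kmers(short_reads, k, min_count=2):
    """k-mers seen at least min_count times; they act as de Bruijn graph nodes."""
    counts = Counter(r[i:i + k] for r in short_reads for i in range(len(r) - k + 1))
    return {kmer for kmer, c in counts.items() if c >= min_count}

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def bridge(graph, source, target, region, max_inner):
    """Search paths from source k-mer to target k-mer (assumption: exhaustive,
    length-bounded DFS) and return the bridging sequence closest to region."""
    k = len(source)
    best = None
    stack = [(source, "")]
    while stack:
        kmer, spelled = stack.pop()
        if kmer == target and len(spelled) >= k:
            inner = spelled[:-k]  # bases strictly between source and target
            d = edit_distance(inner, region)
            if best is None or d < best[0]:
                best = (d, inner)
            continue
        if len(spelled) >= max_inner + k:
            continue  # path too long to be a plausible correction
        for base in "ACGT":
            nxt = kmer[1:] + base
            if nxt in graph:
                stack.append((nxt, spelled + base))
    return None if best is None else best[1]

def correct_read(read, graph, k, slack=2):
    """Replace each weak region flanked by solid k-mers with a graph path."""
    solid_pos = [i for i in range(len(read) - k + 1) if read[i:i + k] in graph]
    if not solid_pos:
        return read
    prev = solid_pos[0]
    out = read[:prev + k]
    for pos in solid_pos[1:]:
        if pos < prev + k:
            # Overlapping solid k-mers: copy the read unchanged.
            out += read[prev + k:pos + k]
        else:
            region = read[prev + k:pos]
            fix = bridge(graph, read[prev:prev + k], read[pos:pos + k],
                         region, len(region) + slack)
            out += (region if fix is None else fix) + read[pos:pos + k]
        prev = pos
    return out + read[prev + k:]

# Toy data (hypothetical): a 24-base "genome" and overlapping short reads.
genome = "ACGTTGCATGCCGATAGCTAGGCT"
short_reads = [genome[i:i + 12] for i in range(13)] * 2
graph = solid_kmers(short_reads, k=5)
noisy = genome[:11] + "A" + genome[12:]  # one substitution in the long read
print(correct_read(noisy, graph, k=5))
```

The sketch only corrects regions that lie between two solid k-mers; read ends and regions with overlapping solid flanks are left untouched, and the exhaustive search would not scale to real data.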
Pages: 3506 - 3514
Page count: 9
Related papers
50 in total
  • [31] CARE: context-aware sequencing read error correction
    Kallenborn, Felix
    Hildebrandt, Andreas
    Schmidt, Bertil
    BIOINFORMATICS, 2021, 37 (07) : 889 - 895
  • [32] An efficient algorithm for spectrum error correction
    Zhu, L.
    Xiong, Y.
    Zhendong Gongcheng Xuebao/Journal of Vibration Engineering, 2001, 14 (02): 166 - 171
  • [33] Efficient diagnostics for quantum error correction
    Iyer, Pavithran
    Jain, Aditya
    Bartlett, Stephen D.
    Emerson, Joseph
    PHYSICAL REVIEW RESEARCH, 2022, 4 (04):
  • [34] Update Efficient Codes for Error Correction
    Mazumdar, Arya
    Wornell, Gregory W.
    Chandar, Venkat
    2012 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY PROCEEDINGS (ISIT), 2012,
  • [35] Lerna: transformer architectures for configuring error correction tools for short-and long-read genome sequencing
    Sharma, Atul
    Jain, Pranjal
    Mahgoub, Ashraf
    Zhou, Zihan
    Mahadik, Kanak
    Chaterji, Somali
    BMC BIOINFORMATICS, 2022, 23 (01)
  • [36] Comparative assessment of long-read error correction software applied to Nanopore RNA-sequencing data
    Lima, Leandro
    Marchet, Camille
    Caboche, Segolene
    Da Silva, Corinne
    Istace, Benjamin
    Aury, Jean-Marc
    Touzet, Helene
    Chikhi, Rayan
    BRIEFINGS IN BIOINFORMATICS, 2020, 21 (04) : 1164 - 1181
  • [37] Lerna: transformer architectures for configuring error correction tools for short- and long-read genome sequencing
    Atul Sharma
    Pranjal Jain
    Ashraf Mahgoub
    Zihan Zhou
    Kanak Mahadik
    Somali Chaterji
    BMC Bioinformatics, 23
  • [38] Local read haplotagging enables accurate long-read small variant calling
    Kolesnikov, Alexey
    Cook, Daniel
    Nattestad, Maria
    Brambrink, Lucas
    McNulty, Brandy
    Gorzynski, John
    Goenka, Sneha
    Ashley, Euan A.
    Jain, Miten
    Miga, Karen H.
    Paten, Benedict
    Chang, Pi-Chuan
    Carroll, Andrew
    Shafin, Kishwar
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [39] Optimization of Decoder Priors for Accurate Quantum Error Correction
    Sivak, Volodymyr
    Newman, Michael
    Klimov, Paul
    PHYSICAL REVIEW LETTERS, 2024, 133 (15)
  • [40] SvAnna: efficient and accurate pathogenicity prediction of coding and regulatory structural variants in long-read genome sequencing
    Danis, Daniel
    Jacobsen, Julius O. B.
    Balachandran, Parithi
    Zhu, Qihui
    Yilmaz, Feyza
    Reese, Justin
    Haimel, Matthias
    Lyon, Gholson J.
    Helbig, Ingo
    Mungall, Christopher J.
    Beck, Christine R.
    Lee, Charles
    Smedley, Damian
    Robinson, Peter N.
    GENOME MEDICINE, 2022, 14 (01)