Next-Generation Sequence Assembly: Four Stages of Data Processing and Computational Challenges

Cited by: 79
Authors
El-Metwally, Sara [1]
Hamza, Taher [1]
Zakaria, Magdi [1]
Helmy, Mohamed [2, 3]
Affiliations
[1] Mansoura Univ, Dept Comp Sci, Fac Comp & Informat, Mansoura, Egypt
[2] Al Azhar Univ, Dept Bot, Fac Agr, Cairo, Egypt
[3] Al Azhar Univ, Fac Agr, Dept Biotechnol, Cairo, Egypt
Keywords
READ ERROR-CORRECTION; SHORT DNA-SEQUENCES; DE-BRUIJN GRAPHS; GENOME SEQUENCE; STRING GRAPH; PAIRED READS; ALGORITHM; TECHNOLOGIES; VELVET; PLATFORMS;
DOI
10.1371/journal.pcbi.1003345
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Decoding DNA symbols using next-generation sequencers was a major breakthrough in genomic research. Despite the many advantages of next-generation sequencers, e.g., the high-throughput sequencing rate and relatively low cost of sequencing, the assembly of the reads produced by these sequencers remains a major challenge. In this review, we address the basic framework of next-generation genome sequence assemblers, which comprises four basic stages: preprocessing filtering, a graph construction process, a graph simplification process, and postprocessing filtering. We discuss them as a framework of four stages for data analysis and processing and survey a variety of techniques, algorithms, and software tools used during each stage. We also discuss the challenges that face current assemblers in the next-generation environment to determine the current state of the art. We recommend a layered architecture approach for constructing a general assembler that can handle the sequences generated by different sequencing platforms.
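The four stages named in the abstract can be illustrated with a toy pipeline. The sketch below is not taken from the paper: it assumes a de Bruijn graph as the graph model (one of the approaches listed in the keywords), uses a simple k-mer frequency cutoff as a stand-in for preprocessing filtering, and a minimum contig length as a stand-in for postprocessing filtering. All function names and parameters (`filter_reads`, `build_graph`, `simplify`, `assemble`, `min_count`, `min_len`) are hypothetical, and isolated cycles in the graph are not handled.

```python
from collections import Counter, defaultdict


def kmers(read, k):
    """Return all k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def filter_reads(reads, k, min_count=2):
    """Stage 1 (preprocessing filtering, simplified): drop reads that
    contain any k-mer seen fewer than min_count times in the data set."""
    counts = Counter(km for r in reads for km in kmers(r, k))
    return [r for r in reads if all(counts[km] >= min_count for km in kmers(r, k))]


def build_graph(reads, k):
    """Stage 2 (graph construction): de Bruijn graph whose nodes are
    (k-1)-mers and whose edges are the k-mers observed in the reads."""
    graph = defaultdict(set)
    for r in reads:
        for km in kmers(r, k):
            graph[km[:-1]].add(km[1:])
    return graph


def simplify(graph):
    """Stage 3 (graph simplification): merge unbranched paths into contigs.
    Assumption: isolated cycles are ignored in this sketch."""
    indeg = Counter(v for succs in graph.values() for v in succs)
    nodes = set(graph) | set(indeg)

    def is_internal(n):
        # Exactly one way in and one way out: safe to merge through.
        return indeg[n] == 1 and len(graph.get(n, ())) == 1

    contigs = []
    for n in nodes:
        if is_internal(n):
            continue
        for succ in graph.get(n, ()):
            contig, cur = n + succ[-1], succ
            while is_internal(cur):
                cur = next(iter(graph[cur]))
                contig += cur[-1]
            contigs.append(contig)
    return sorted(contigs)


def assemble(reads, k, min_count=2, min_len=0):
    """Run the four stages; stage 4 (postprocessing filtering, simplified)
    keeps only contigs of at least min_len bases."""
    clean = filter_reads(reads, k, min_count)
    contigs = simplify(build_graph(clean, k))
    return [c for c in contigs if len(c) >= min_len]


if __name__ == "__main__":
    # Two overlapping reads, each seen twice, plus one read with an error.
    reads = ["ATGGCGT", "ATGGCGT", "GGCGTGCA", "GGCGTGCA", "ATGGAGT"]
    print(assemble(reads, k=4))
```

Keeping each stage as a separate function mirrors the staged framing of the abstract: any one stage could be swapped for a different technique without touching the others.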
Pages: 19
Related papers
50 in total
  • [31] PVT: An Efficient Computational Procedure to Speed up Next-generation Sequence Analysis
    Maji, Ranjan Kumar
    Sarkar, Arijita
    Khatua, Sunirmal
    Dasgupta, Subhasis
    Ghosh, Zhumur
    BMC BIOINFORMATICS, 2014, 15
  • [32] Next-Generation DNA Assembly Tools
    Peng, Lansha
    Tsvetanova, Billyana
    Liang, Xiquan
    Katzen, Federico
    GENETIC ENGINEERING & BIOTECHNOLOGY NEWS, 2010, 30 (18): 32-33
  • [33] Next-generation DNA assembly tools
    Peng, Lansha
    Tsvetanova, Billyana
    Liang, Xiquan
    Katzen, Federico
    Genetic Engineering and Biotechnology News, 2010, 30 (18):
  • [34] A next-generation human genome sequence
    Church, Deanna M.
    SCIENCE, 2022, 376 (6588): 34-35
  • [35] Support the Data Enthusiast: Challenges for Next-Generation Data-Analysis Systems
    Morton, Kristi
    Balazinska, Magdalena
    Grossman, Dan
    Mackinlay, Jock
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 7 (06): 453-456
  • [36] Novel Computational Technologies for Next-Generation Sequencing Data Analysis and Their Applications
    Tang, Chuan Yi
    Hung, Che-Lun
    Zheng, Huiru
    Lin, Chun-Yuan
    Jiang, Hai
    INTERNATIONAL JOURNAL OF GENOMICS, 2015, 2015
  • [37] De novo assembly of transcriptome from next-generation sequencing data
    Xuan Li
    Yimeng Kong
    QiongYi Zhao
    YuanYuan Li
    Pei Hao
    Quantitative Biology, 2016, 4 (02): 94-105
  • [38] Is next-generation radiologist ready for the challenges?
    Mohan, Chander S. M.
    INDIAN JOURNAL OF RADIOLOGY AND IMAGING, 2019, 29 (01): 2-3
  • [39] Key challenges for next-generation pharmacogenomics
    Kampourakis, Kostas
    Vayena, Effy
    Mitropoulou, Christina
    van Schaik, Ron H.
    Cooper, David N.
    Borg, Joseph
    Patrinos, George P.
    EMBO REPORTS, 2014, 15 (05): 472-476
  • [40] Testing Challenges for Next-Generation Radios
    Pleasant, Dan
    2008 IEEE AUTOTESTCON, VOLS 1 AND 2, 2008: 371-375