Linear time complexity de novo long read genome assembly with GoldRush

Authors
Johnathan Wong
Lauren Coombe
Vladimir Nikolić
Emily Zhang
Ka Ming Nip
Puneet Sidhu
René L. Warren
Inanç Birol
Affiliations
[1] Canada’s Michael Smith Genome Sciences Centre
[2] BC Cancer
Abstract
Current state-of-the-art de novo long read genome assemblers follow the Overlap-Layout-Consensus paradigm. While read-to-read overlap, its most costly step, has been improved in modern long read genome assemblers, these tools still often require excessive RAM when assembling a typical human dataset. Our work departs from this paradigm, forgoing all-vs-all sequence alignments in favor of a dynamic data structure implemented in GoldRush, a de novo long read genome assembly algorithm with linear time complexity. We tested GoldRush on Oxford Nanopore Technologies long read sequencing datasets with different base error profiles sourced from three human cell lines, rice, and tomato. Here, we show that GoldRush achieves assembly scaffold NGA50 lengths of 18.3-22.2, 0.3, and 2.6 Mbp for the genomes of human, rice, and tomato, respectively, and assembles each genome within a day, using at most 54.5 GB of random-access memory, demonstrating the scalability of our genome assembly paradigm and its implementation.
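To make the NGA50 figures in the abstract easier to interpret, the following is a minimal sketch of the underlying NG50-style calculation. It is not GoldRush code and not the evaluation pipeline used by the authors. It assumes the assembly has already been aligned to a reference and split into aligned blocks (the step that turns NG50 into NGA50); the function name `ng50` and the toy block lengths are hypothetical.

```python
def ng50(block_lengths, genome_size):
    """Return the largest length L such that blocks of length >= L
    together cover at least half of genome_size.

    When block_lengths are aligned-block lengths (scaffolds broken at
    misassemblies), this corresponds to an NGA50-style value.
    """
    half = genome_size / 2
    covered = 0
    # Walk blocks from longest to shortest, accumulating covered bases.
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return 0  # the blocks cover less than half of the genome


# Hypothetical aligned-block lengths (bp) for a toy 100 bp reference.
toy_blocks = [40, 25, 15, 10, 5]
print(ng50(toy_blocks, genome_size=100))  # prints 25
```

In the toy case, the 40 bp block alone covers less than half of the reference, and adding the 25 bp block brings the total to 65 bp, so the statistic is 25.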
Related papers
50 in total
  • [1] Linear time complexity de novo long read genome assembly with GoldRush
    Wong, Johnathan
    Coombe, Lauren
    Nikolic, Vladimir
    Zhang, Emily
    Nip, Ka Ming
    Sidhu, Puneet
    Warren, Rene L.
    Birol, Inanc
    NATURE COMMUNICATIONS, 2023, 14 (01)
  • [3] Long-read sequencing and de novo assembly of a Chinese genome
    Shi, Lingling
    Guo, Yunfei
    Dong, Chengliang
    Huddleston, John
    Yang, Hui
    Han, Xiaolu
    Fu, Aisi
    Li, Quan
    Li, Na
    Gong, Siyi
    Lintner, Katherine E.
    Ding, Qiong
    Wang, Zou
    Hu, Jiang
    Wang, Depeng
    Wang, Feng
    Wang, Lin
    Lyon, Gholson J.
    Guan, Yongtao
    Shen, Yufeng
    Evgrafov, Oleg V.
    Knowles, James A.
    Thibaud-Nissen, Francoise
    Schneider, Valerie
    Yu, Chack-Yung
    Zhou, Libing
    Eichler, Evan E.
    So, Kwok-Fai
    Wang, Kai
    NATURE COMMUNICATIONS, 2016, 7
  • [4] Long-read sequencing and de novo assembly of the cynomolgus macaque genome
    Bai, Bing
    Wang, Yi
    Zhu, Ran
    Zhang, Yaolei
    Wang, Hong
    Fan, Guangyi
    Liu, Xin
    Shi, Hong
    Niu, Yuyu
    Ji, Weizhi
    JOURNAL OF GENETICS AND GENOMICS, 2022, 49 (10) : 975 - 978
  • [6] Long-read de novo genome assembly of Gulf toadfish (Opsanus beta)
    Kron, Nicholas S.
    Young, Benjamin D.
    Drown, Melissa K.
    Mcdonald, M. Danielle
    BMC GENOMICS, 2024, 25 (01)
  • [7] De novo chromosome level assembly of a plant genome from long read sequence data
    Sharma, Priyanka
    Masouleh, Ardashir Kharabian
    Topp, Bruce
    Furtado, Agnelo
    Henry, Robert J.
    PLANT JOURNAL, 2022, 109 (03) : 727 - 736
  • [8] Long-read Sequencing and de novo Genome Assembly of Three Aspergillus fumigatus Genomes
    Samuel J. Hemmings
    Johanna L. Rhodes
    Matthew C. Fisher
    Mycopathologia, 2023, 188 : 409 - 412
  • [9] Long-read sequencing and de novo genome assembly of marine medaka (Oryzias melastigma)
    Pingping Liang
    Hafiz Sohaib Ahmed Saqib
    Xiaomin Ni
    Yingjia Shen
    BMC Genomics, 21
  • [10] Long-read sequencing and de novo genome assembly of Ammopiptanthus nanus, a desert shrub
    Gao, Fei
    Wang, Xue
    Li, Xuming
    Xu, Mingyue
    Li, Huayun
    Abla, Merhaba
    Sun, Huigai
    Wei, Shanjun
    Feng, Jinchao
    Zhou, Yijun
    GIGASCIENCE, 2018, 7 (07)