AsmMix: an efficient haplotype-resolved hybrid de novo genome assembling pipeline

Authors
Liu, Chao [1, 2]
Wu, Pei [1, 2]
Wu, Xue [2]
Zhao, Xia [3]
Chen, Fang [3]
Cheng, Xiaofang [3]
Zhu, Hongmei [1, 2]
Wang, Ou [2]
Xu, Mengyang [2, 4]
Affiliations
[1] BGI, Tianjin, People's Republic of China
[2] BGI Research, Shenzhen, People's Republic of China
[3] MGI Tech, Shenzhen, People's Republic of China
[4] BGI Research, Qingdao, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
long reads; bioinformatics; de novo; genome assembly; haplotype; hybrid; LONG; ACCURATE; READS
DOI
10.3389/fgene.2024.1421565
Chinese Library Classification
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Accurate haplotyping facilitates distinguishing allele-specific expression, identifying cis-regulatory elements, and characterizing genomic variations, which enables more precise investigations into the relationship between genotype and phenotype. Recent advances in third-generation single-molecule long-read and synthetic co-barcoded read sequencing techniques have harnessed long-range information to simplify the assembly graph and improve the assembled genomic sequence. However, it remains methodologically challenging to reconstruct complete haplotypes because of the high sequencing error rates of long reads and the limited capture efficiency of co-barcoded reads. Here we present AsmMix, a pipeline for generating diploid genomes that are both contiguous and accurate. It first assembles co-barcoded reads into accurate haplotype-resolved assemblies that may contain many gaps, whereas the corresponding long-read assembly is contiguous but susceptible to errors. The two assembly sets are then integrated into haplotype-resolved assemblies with fewer misassemblies. Through extensive evaluation on multiple synthetic datasets, AsmMix consistently demonstrates high precision and recall rates for haplotyping across diverse sequencing platforms, coverage depths, read lengths, and read accuracies, significantly outperforming other existing tools in the field. Furthermore, we validate the effectiveness of our pipeline using a human whole-genome dataset (HG002) and produce highly contiguous, accurate, and haplotype-resolved assemblies. These assemblies are evaluated using the GIAB benchmarks, confirming the accuracy of variant calling. Our results demonstrate that AsmMix offers a straightforward yet highly efficient approach that effectively leverages both long reads and co-barcoded reads for haplotype-resolved assembly.
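The abstract reports precision and recall for haplotyping without defining them in this record. As a rough illustration only, the sketch below assumes a common site-level definition: precision is the fraction of phased heterozygous sites assigned to the correct haplotype, and recall is the fraction of all heterozygous sites that are phased correctly. The function name `haplotyping_precision_recall` and the 0/1 label encoding are hypothetical and are not taken from AsmMix; the paper's own evaluation may differ.

```python
from typing import Optional, Sequence, Tuple


def haplotyping_precision_recall(
    truth: Sequence[int],
    predicted: Sequence[Optional[int]],
) -> Tuple[float, float]:
    """Return (precision, recall) for one phase block.

    truth: true haplotype label (0 or 1) for each heterozygous site.
    predicted: assigned label (0 or 1), or None where the site is unphased.
    These definitions are an assumption, not the paper's stated metric.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    # Only sites that received a haplotype label count toward precision.
    phased = [(t, p) for t, p in zip(truth, predicted) if p is not None]
    if not truth or not phased:
        return 0.0, 0.0

    # Haplotype names are arbitrary (hap 0 vs hap 1), so score both
    # labelings of the block and keep the better one.
    same = sum(1 for t, p in phased if t == p)
    correct = max(same, len(phased) - same)

    precision = correct / len(phased)
    recall = correct / len(truth)
    return precision, recall


# Toy example: six heterozygous sites, one left unphased, one phased wrongly.
truth = [0, 0, 1, 1, 0, 1]
predicted = [1, 1, 0, 0, None, 1]
precision, recall = haplotyping_precision_recall(truth, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Scoring both labelings reflects that a phased block carries no inherent notion of which haplotype is "first"; a real evaluation would apply this per phase block and aggregate across blocks.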
Pages: 17