AsmMix: an efficient haplotype-resolved hybrid de novo genome assembling pipeline

Authors
Liu, Chao [1, 2]
Wu, Pei [1, 2]
Wu, Xue [2]
Zhao, Xia [3]
Chen, Fang [3]
Cheng, Xiaofang [3]
Zhu, Hongmei [1, 2]
Wang, Ou [2]
Xu, Mengyang [2, 4]
Affiliations
[1] BGI, Tianjin, People's Republic of China
[2] BGI Res, Shenzhen, People's Republic of China
[3] MGI Tech, Shenzhen, People's Republic of China
[4] BGI Res, Qingdao, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
long reads; bioinformatics; de novo; genome assembly; haplotype; hybrid; LONG; ACCURATE; READS
DOI
10.3389/fgene.2024.1421565
Chinese Library Classification number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Accurate haplotyping facilitates distinguishing allele-specific expression, identifying cis-regulatory elements, and characterizing genomic variations, enabling more precise investigation of the relationship between genotype and phenotype. Recent advances in third-generation single-molecule long-read and synthetic co-barcoded read sequencing have harnessed long-range information to simplify the assembly graph and improve the assembled genomic sequence. However, reconstructing complete haplotypes remains methodologically challenging because of the high sequencing error rates of long reads and the limited capture efficiency of co-barcoded reads. Here we present AsmMix, a pipeline for generating diploid genomes that are both contiguous and accurate. It first assembles co-barcoded reads into accurate haplotype-resolved assemblies that may contain many gaps, whereas the long-read assembly is contiguous but susceptible to errors. The two assembly sets are then integrated into haplotype-resolved assemblies with fewer misassemblies. In extensive evaluations on multiple synthetic datasets, AsmMix consistently achieves high precision and recall for haplotyping across diverse sequencing platforms, coverage depths, read lengths, and read accuracies, significantly outperforming other existing tools. Furthermore, we validate the pipeline on a human whole-genome dataset (HG002) and produce highly contiguous, accurate, haplotype-resolved assemblies. These assemblies are evaluated against the GIAB benchmarks, confirming the accuracy of variant calling. Our results demonstrate that AsmMix offers a straightforward yet highly efficient approach that leverages both long reads and co-barcoded reads for haplotype-resolved assembly.
Pages: 17