Assembling contigs in draft genomes using reversals and block-interchanges

Cited by: 6
Authors
Li, Chi-Long [1]
Chen, Kun-Tze [1]
Lu, Chin Lung [1]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 30013, Taiwan
Source
BMC Bioinformatics | 2013 / Vol. 14
Keywords
ALGORITHM;
DOI
10.1186/1471-2105-14-S5-S9
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Next-generation sequencing techniques allow an increasing number of draft genomes to be produced rapidly and at decreasing cost. However, these draft genomes are usually only partially sequenced, as collections of unassembled contigs, and so cannot be used directly by existing algorithms for studying genome rearrangements and reconstructing phylogenies. In this work, we study the one-sided block (or contig) ordering problem with weighted reversal and block-interchange distance. Given a partially assembled genome pi and a completely assembled genome sigma, the problem is to find an optimal ordering to assemble (i.e., order and orient) the contigs of pi such that the rearrangement distance between the assembled contigs of pi and sigma, measured by reversals and block-interchanges (also called generalized transpositions) with the weight ratio 1:2, is minimized. In addition to genome rearrangements and phylogeny reconstruction, the one-sided block ordering problem has a particularly useful application in genome resequencing, because its algorithms can be used to assemble the contigs of a draft genome pi based on a reference genome sigma. By using permutation groups, we design an efficient algorithm that solves this one-sided block ordering problem in O(delta n) time, where n is the number of genes or markers and delta is the number of reversals and block-interchanges used. We also show that the assembly of the partially assembled genome can be done in O(n) time and that its weighted rearrangement distance from the completely assembled genome can be calculated in advance in O(n) time. Finally, we have implemented our algorithm as a program and used simulated datasets to compare its accuracy with that of SIS, an existing similar tool based on a heuristic algorithm that considers only reversals, in assembling the contigs of draft genomes based on their reference genomes.
Our experimental results show that our program is more accurate than SIS as the number of reversals and transpositions involved in the rearrangement events between the complete genomes of pi and sigma increases. In particular, the more transpositions are involved in the rearrangement events, the larger the accuracy gap between our program and SIS.
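To make the problem statement concrete, the following minimal Python sketch illustrates the two rearrangement operations and the 1:2 weighting named in the abstract. It is an illustration only, not the authors' permutation-group algorithm. It assumes linear genomes represented as lists of signed integers (the sign encodes gene orientation), and every function and constant name in it is hypothetical.

```python
from typing import List, Tuple

Genome = List[int]  # assumption: signed genes on a linear genome, e.g. [1, -3, 2]

# Weight ratio 1:2 between reversals and block-interchanges, as in the abstract.
REVERSAL_WEIGHT = 1
BLOCK_INTERCHANGE_WEIGHT = 2


def reversal(genome: Genome, i: int, j: int) -> Genome:
    """Reverse the segment genome[i..j] (inclusive) and flip each gene's sign."""
    if not (0 <= i <= j < len(genome)):
        raise ValueError("need 0 <= i <= j < len(genome)")
    flipped = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + flipped + genome[j + 1:]


def block_interchange(genome: Genome, i: int, j: int, k: int, l: int) -> Genome:
    """Swap the non-overlapping blocks genome[i..j] and genome[k..l] (inclusive)."""
    if not (0 <= i <= j < k <= l < len(genome)):
        raise ValueError("need 0 <= i <= j < k <= l < len(genome)")
    return (genome[:i] + genome[k:l + 1] + genome[j + 1:k]
            + genome[i:j + 1] + genome[l + 1:])


def assemble(contigs: List[Genome], order: List[Tuple[int, bool]]) -> Genome:
    """Concatenate contigs in the given order; a True flag reverse-complements one."""
    assembled: Genome = []
    for index, flip in order:
        contig = contigs[index]
        assembled.extend([-g for g in reversed(contig)] if flip else contig)
    return assembled


def scenario_weight(num_reversals: int, num_block_interchanges: int) -> int:
    """Weighted cost of a rearrangement scenario under the 1:2 ratio."""
    return (num_reversals * REVERSAL_WEIGHT
            + num_block_interchanges * BLOCK_INTERCHANGE_WEIGHT)


# Toy instance: three contigs of pi and a reference genome sigma.
contigs = [[1, 4], [-3, -2], [5]]
sigma = [1, 2, 3, 4, 5]

# One candidate ordering: keep contig 0, reverse contig 1, keep contig 2.
pi = assemble(contigs, [(0, False), (1, True), (2, False)])

# A single block-interchange (swap [4] with [2, 3]) turns pi into sigma.
transformed = block_interchange(pi, 1, 1, 2, 3)
cost = scenario_weight(num_reversals=0, num_block_interchanges=1)
```

In this toy instance the chosen ordering and orientation leave the assembled genome one block-interchange away from sigma. The problem studied in the paper asks for the ordering and orientation that minimize this weighted cost over all possible assemblies, which the sketch does not attempt to search for.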
Pages: 13