Progressive multiple sequence alignments from triplets

Cited by: 17
Authors
Kruspe, Matthias
Stadler, Peter F.
Affiliations
[1] Univ Leipzig, Bioinformat Grp, Dept Comp Sci, D-04107 Leipzig, Germany
[2] Univ Leipzig, Bioinformat Grp, Interdisciplinary Ctr Bioinformat, D-04107 Leipzig, Germany
[3] Fraunhofer Inst Zelltherapie & Immunol IZI, D-04103 Leipzig, Germany
[4] Univ Vienna, Inst Theoret Chem, A-1090 Vienna, Austria
[5] Santa Fe Inst, Santa Fe, NM 87501 USA
Keywords
DOI
10.1186/1471-2105-8-254
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The quality of progressive sequence alignments strongly depends on the accuracy of the individual pairwise alignment steps, since gaps that are introduced at one step cannot be removed at later aggregation steps. Adjacent insertions and deletions necessarily appear in arbitrary order in pairwise alignments and hence form an unavoidable source of errors. Results: Here we present a modified variant of progressive sequence alignments that addresses both issues. Instead of pairwise alignments we use exact dynamic programming to align sequence or profile triples. This avoids a large fraction of the ambiguities arising in pairwise alignments. In the subsequent aggregation steps we follow the logic of the Neighbor-Net algorithm, which constructs a phylogenetic network by stepwise replacement of triples by pairs instead of combining pairs to singletons. To this end the three-way alignments are subdivided into two partial alignments, at which stage all-gap columns are naturally removed. This alleviates the "once a gap, always a gap" problem of progressive alignment procedures. Conclusion: The three-way Neighbor-Net based alignment program aln3nn is shown to compare favorably on both protein sequences and nucleic acid sequences to other progressive alignment tools. In the latter case one can easily include scoring terms that consider secondary structure features. Overall, the quality of the resulting alignments in general exceeds that of clustalw or other multiple alignment tools, even though our software does not include heuristics for context-dependent (mis)match scores.
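The two ideas in the abstract, exact dynamic programming over a triple and dropping all-gap columns when a three-way alignment is subdivided, can be illustrated with a minimal sketch. It is not the aln3nn implementation: the sum-of-pairs objective, the linear gap cost, the score values, and the function names `align3` and `project` are all assumptions made for illustration, and profiles are not handled.

```python
from itertools import product

# Illustrative scores only (assumed, not taken from the paper).
MATCH, MISMATCH, GAP = 1, -1, -2

def pair_score(a, b):
    """Score one pair of symbols within a column; '-' denotes a gap."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH

def column_score(col):
    """Sum-of-pairs score of a three-row alignment column."""
    x, y, z = col
    return pair_score(x, y) + pair_score(x, z) + pair_score(y, z)

# Seven possible moves: each sequence either consumes a symbol (1) or gets a gap (0).
MOVES = [m for m in product((0, 1), repeat=3) if any(m)]

def align3(a, b, c):
    """Exact global alignment of three sequences by 3D dynamic programming."""
    seqs = (a, b, c)
    end = (len(a), len(b), len(c))
    score = {(0, 0, 0): 0}
    back = {}
    for cell in product(*(range(n + 1) for n in end)):
        if cell == (0, 0, 0):
            continue
        best, best_move = float("-inf"), None
        for move in MOVES:
            prev = tuple(p - d for p, d in zip(cell, move))
            if min(prev) < 0:
                continue
            col = tuple(s[p] if d else "-" for s, p, d in zip(seqs, prev, move))
            cand = score[prev] + column_score(col)
            if cand > best:
                best, best_move = cand, move
        score[cell] = best
        back[cell] = best_move
    # Trace back from the far corner to recover the aligned rows.
    rows = [[], [], []]
    cell = end
    while cell != (0, 0, 0):
        move = back[cell]
        prev = tuple(p - d for p, d in zip(cell, move))
        for row, s, p, d in zip(rows, seqs, prev, move):
            row.append(s[p] if d else "-")
        cell = prev
    return score[end], ["".join(reversed(r)) for r in rows]

def project(rows, keep):
    """Restrict an alignment to the rows in `keep`, dropping all-gap columns."""
    sub = [rows[i] for i in keep]
    cols = [col for col in zip(*sub) if any(ch != "-" for ch in col)]
    return ["".join(col[i] for col in cols) for i in range(len(sub))]

if __name__ == "__main__":
    total, aligned = align3("ACGT", "AGT", "ACT")
    print(total)
    print("\n".join(aligned))
    print(project(aligned, (0, 1)))
```

The table has one cell per index triple, so time and memory grow with the product of the three sequence lengths; the sketch is only practical for short inputs. `project` shows why a gap introduced in a triple need not persist: a column that is empty in the retained rows simply disappears from the partial alignment.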
Pages: 12
Related papers
50 in total
  • [21] ADVANTAGES OF USING MULTIPLE SEQUENCE ALIGNMENTS OVER PAIRWISE ALIGNMENTS WHEN SEQUENCE SIMILARITY IS LOW
    BABBITT, PC
    DUNAWAYMARIANO, D
    KENYON, GL
    BIOCHEMISTRY, 1992, 31 (07) : 2198 - 2198
  • [22] Embedding strategies for effective use of information from multiple sequence alignments
    Henikoff, S
    Henikoff, JG
    PROTEIN SCIENCE, 1997, 6 (03) : 698 - 705
  • [23] IP-MSA: Independent Order of Progressive Multiple Sequence Alignments Using Different Substitution Matrices
    Boraik, Aziz Nasser
    Abdullah, Rosni
    Venkat, Ibrahim
    INTERNATIONAL CONFERENCE ON QUANTITATIVE SCIENCES AND ITS APPLICATIONS (ICOQSIA 2014), 2014, 1635 : 403 - 409
  • [24] ALSCRIPT - A TOOL TO FORMAT MULTIPLE SEQUENCE ALIGNMENTS
    BARTON, GJ
    PROTEIN ENGINEERING, 1993, 6 (01) : 37 - 40
  • [25] Source Coding Scheme for Multiple Sequence Alignments
    Hanus, Pavol
    Dingel, Janis
    Chalkidis, Georg
    Hagenauer, Joachim
    DCC 2009: 2009 DATA COMPRESSION CONFERENCE, PROCEEDINGS, 2008, : 183 - 192
  • [26] COMPENSATING CHANGES IN PROTEIN MULTIPLE SEQUENCE ALIGNMENTS
    TAYLOR, WR
    HATRICK, K
    PROTEIN ENGINEERING, 1994, 7 (03) : 341 - 348
  • [27] A minimum reporting standard for multiple sequence alignments
    Wong, Thomas K. F.
    Kalyaanamoorthy, Subha
    Meusemann, Karen
    Yeates, David K.
    Misof, Bernhard
    Jermiin, Lars S.
    NAR GENOMICS AND BIOINFORMATICS, 2020, 2 (02)
  • [28] Sequence Diversity Diagram for comparative analysis of multiple sequence alignments
    Ryo Sakai
    Jan Aerts
    BMC Proceedings, 8 (Suppl 2)
  • [29] State of the art: refinement of multiple sequence alignments
    Saikat Chakrabarti
    Christopher J Lanczycki
    Anna R Panchenko
    Teresa M Przytycka
    Paul A Thiessen
    Stephen H Bryant
    BMC Bioinformatics, 7
  • [30] The impact of single substitutions on multiple sequence alignments
    Klaere, Steffen
    Gesell, Tanja
    von Haeseler, Arndt
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2008, 363 (1512) : 4041 - 4047