BranchClust: a phylogenetic algorithm for selecting gene families

Cited by: 35
Authors
Poptsova, Maria S. [1]
Gogarten, J. Peter [1]
Affiliations
[1] Univ Connecticut, Dept Mol & Cell Biol, Storrs, CT 06269 USA
DOI
10.1186/1471-2105-8-120
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Automated methods for assembling families of orthologous genes include those based on sequence similarity scores and those based on phylogenetic approaches. The former are easy to automate, but they usually do not distinguish between paralogs and orthologs, or they restrict the number of taxa. Phylogenetic methods are often based on reconciling a gene tree with a known rooted species tree; a limitation of this approach, especially in the case of prokaryotes, is that the species tree is often unknown and that the branching order between related organisms is frequently unresolved in analyses of single gene families.
Results: Here we describe an algorithm for the automated selection of orthologous genes that recognizes orthologous genes from different species in a phylogenetic tree for any number of taxa. The algorithm is capable of distinguishing complete (containing all taxa) and incomplete (not containing all taxa) families and recognizes in- and outparalogs. The BranchClust algorithm is implemented in Perl, using the BioPerl module for parsing trees, and is freely available at http://bioinformatics.org/branchclust.
Conclusion: BranchClust outperforms the reciprocal best BLAST hit method in selecting more sets of putatively orthologous genes. In the test cases examined, the correctness of the selected families and of the identified in- and outparalogs was confirmed by inspection of the pertinent phylogenetic trees.
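The distinction the abstract draws between complete and incomplete families can be illustrated with a minimal Python sketch. This is not the BranchClust implementation (which is written in Perl and works on phylogenetic trees); the function name `classify_family`, the gene identifiers, and the taxon labels are hypothetical, and the input is assumed to be an already selected candidate family given as a gene-to-taxon mapping.

```python
from collections import Counter


def classify_family(gene_to_taxon, all_taxa):
    """Label a candidate family as complete or incomplete.

    gene_to_taxon: dict mapping a gene identifier to the taxon it comes from
                   (hypothetical input format, assumed for this sketch).
    all_taxa:      iterable of every taxon included in the analysis.
    """
    # Count how many genes each taxon contributes to the family.
    counts = Counter(gene_to_taxon.values())

    # A family is complete only if every taxon is represented at least once.
    missing = sorted(set(all_taxa) - set(counts))
    status = "complete" if not missing else "incomplete"

    # Taxa with more than one gene hold candidate paralogs. Telling
    # inparalogs from outparalogs needs the tree topology, which this
    # sketch deliberately does not model.
    duplicated = sorted(t for t, n in counts.items() if n > 1)

    return {
        "status": status,
        "missing_taxa": missing,
        "taxa_with_multiple_genes": duplicated,
    }


if __name__ == "__main__":
    family = {"geneA1": "taxonA", "geneB1": "taxonB", "geneB2": "taxonB"}
    print(classify_family(family, ["taxonA", "taxonB", "taxonC"]))
```

The sketch only checks taxon coverage and copy number; selecting the family in the first place, and classifying the extra copies, is the part that relies on the phylogenetic tree.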
Pages: 16