Detection of unrelated proteins in sequences multiple alignments by using predicted secondary structures

Cited by: 19
Authors
Errami, M [1]
Geourjon, C [1]
Deléage, G [1]
Affiliations
[1] Inst Biol & Chim Prot, Pole Bioinformat Lyonnais, CNRS, UMR 5086, F-69367 Lyon 07, France
DOI
10.1093/bioinformatics/btg016
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Multiple sequence alignments are essential tools for establishing homology relations between proteins. Amino acids that are essential for function and/or structure are generally conserved, which provides key arguments for protein characterization. For distant proteins, however, it is more difficult to reliably establish the homology relations that may exist between them. In this article, we show that secondary structure prediction is a valuable way to validate protein families at low identity rates. Results: We show that analysing the compatibility of secondary structures is a reliable way to discard unrelated proteins from low-identity multiple alignments.
Pages: 506-512
Number of pages: 7
Related papers
50 in total
  • [21] SPEM: improving multiple sequence alignment with sequence profiles and predicted secondary structures
    Zhou, HY
    Zhou, YQ
    BIOINFORMATICS, 2005, 21 (18) : 3615 - 3621
  • [23] MERLIN: Identifying Inaccuracies in Multiple Sequence Alignments Using Object Detection
    Khodji, Hiba
    Herbay, Lucille
    Collet, Pierre
    Thompson, Julie
    Jeannin-Girardon, Anne
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2022, PART I, 2022, 646 : 192 - 203
  • [24] HOMOLOGY OF THE PREDICTED SECONDARY STRUCTURES OF THE N-TERMINAL FRAGMENTS OF PRE-PROTEINS
    TLOMAK, P
    NOWAK, K
    ACTA BIOCHIMICA POLONICA, 1981, 28 (3-4) : 253 - 265
  • [25] Using the T-Coffee package to build multiple sequence alignments of protein, RNA, DNA sequences and 3D structures
    Taly, Jean-Francois
    Magis, Cedrik
    Bussotti, Giovanni
    Chang, Jia-Ming
    Di Tommaso, Paolo
    Erb, Ionas
    Espinosa-Carrasco, Jose
    Kemena, Carsten
    Notredame, Cedric
    NATURE PROTOCOLS, 2011, 6 (11) : 1669 - 1682
  • [26] TurboFold: Iterative probabilistic estimation of secondary structures for multiple RNA sequences
    Harmanci, Arif O.
    Sharma, Gaurav
    Mathews, David H.
    BMC BIOINFORMATICS, 2011, 12
  • [27] Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences
    Xu, Zhenjiang
    Mathews, David H.
    BIOINFORMATICS, 2011, 27 (05) : 626 - 632
  • [29] Approximation algorithms for optimal RNA secondary structures common to multiple sequences
    Tamura, Takeyuki
    Akutsu, Tatsuya
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2007, E90A (05) : 917 - 923
  • [30] INFOGENE: a database of known gene structures and predicted genes and proteins in sequences of genome sequencing projects
    Solovyev, VV
    Salamov, AA
    NUCLEIC ACIDS RESEARCH, 1999, 27 (01) : 248 - 250