Information Content of Sets of Biological Sequences Revisited
Cited by: 0
Authors:
Carbone, Alessandra [1]
Engelen, Stefan [1]
Affiliations:
[1] Univ Paris 06, INSERM, UMRS511, F-75013 Paris, France
Source:
Keywords:
PROTEIN;
COMPLEXITY;
ALIGNMENT;
EVOLUTION;
DOI:
10.1007/978-3-540-88869-7_3
Chinese Library Classification number:
TP39 [Computer applications];
Discipline classification code:
081203 ;
0835 ;
Abstract:
To analyze the information contained in a pool of amino acid sequences, a first approach is to align the sequences, estimate the probability of each amino acid occurring within each column of the alignment, and combine these values through an "entropy" function whose minimum corresponds to absence of information, that is, to the case where each amino acid has the same probability of occurring. An alternative is to construct a distance tree between sequences (derived from the alignment) based on sequence similarity and to interpret the tree topology so as to model the evolutionary property of residue conservation. We introduce the concept of "evolutionary content" of a tree of sequences, and demonstrate to what extent the more classical notion of "information content" of sequences approximates the new measure and in what manner tree topology contributes sharper information for the detection of protein binding sites.
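As an illustration of the classical column-wise measure mentioned in the abstract, the following is a minimal sketch, assuming information content is taken as log2(20) minus the Shannon entropy of the empirical residue frequencies in a column. The function names, the gap handling, and the absence of pseudocounts or sequence weighting are assumptions made for this example; it is not the paper's "evolutionary content" and may differ from the exact formula the authors use.

```python
import math
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_information_content(column, alphabet=AMINO_ACIDS):
    """Return log2(|alphabet|) - H(column) in bits.

    Characters outside the alphabet (gaps, unknown residues) are ignored.
    This is an assumption of the sketch, not a rule taken from the paper.
    """
    residues = [c for c in column.upper() if c in alphabet]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    # Shannon entropy of the empirical residue distribution.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Zero when all residues are equally frequent, maximal when one residue dominates.
    return math.log2(len(alphabet)) - entropy


def alignment_information_content(sequences):
    """Compute the information content of every column of equal-length aligned sequences."""
    return [column_information_content("".join(col)) for col in zip(*sequences)]


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "AGF"]  # hypothetical aligned sequences
    for i, ic in enumerate(alignment_information_content(toy_alignment), start=1):
        print(f"column {i}: {ic:.3f} bits")
```

In this sketch a fully conserved column scores log2(20) bits, while a column in which all 20 amino acids are equally frequent scores zero, matching the abstract's description of the minimum as the absence of information.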
Pages: 31-42
Number of pages: 12