Information Content of Sets of Biological Sequences Revisited

Cited by: 0

Authors:

Carbone, Alessandra [1]

Engelen, Stefan [1]

Affiliations:

[1] Univ Paris 06, INSERM, UMRS511, F-75013 Paris, France

Source:

ALGORITHMIC BIOPROCESSES | 2009

Keywords:

PROTEIN; COMPLEXITY; ALIGNMENT; EVOLUTION;

DOI:

10.1007/978-3-540-88869-7_3

Chinese Library Classification:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

To analyze the information contained in a pool of amino acid sequences, a first approach is to align the sequences, to estimate the probability of each amino acid occurring within the columns of the alignment, and to combine these values through an "entropy" function whose minimum corresponds to the absence of information, that is, to the case where each amino acid has the same probability of occurring. An alternative is to construct a distance tree between the sequences (derived from the alignment) based on sequence similarity and to interpret the tree topology so as to model the evolutionary property of residue conservation. We introduce the concept of "evolutionary content" of a tree of sequences, and demonstrate to what extent the more classical notion of "information content" of sequences approximates the new measure and in what manner tree topology contributes sharper information for the detection of protein binding sites.
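The first approach in the abstract (column probabilities combined through an entropy-style function) can be illustrated with a minimal sketch. It assumes the classical per-column measure log2(20) - H, which is 0 when all 20 amino acids are equally likely, matching the "minimum corresponds to the absence of information" description. This record does not give the chapter's exact function, so the formula, the function names, and the choice to ignore gap characters are assumptions made for illustration only; no small-sample correction is applied.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def column_information(column):
    """Return log2(20) - H(column) in bits for one alignment column.

    Assumption: gaps and non-standard symbols are simply ignored.
    """
    residues = [c for c in column.upper() if c in AMINO_ACIDS]
    if not residues:
        return 0.0  # nothing observed, so no information
    counts = Counter(residues)
    total = len(residues)
    # Shannon entropy of the observed residue frequencies
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(AMINO_ACIDS)) - entropy


def alignment_information(sequences):
    """Per-column information for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    return [column_information("".join(col)) for col in zip(*sequences)]
```

Under these assumptions a fully conserved column scores log2(20) (about 4.32 bits), and a column in which every amino acid appears equally often scores 0. The tree-based "evolutionary content" is not sketched here because its definition is not given in this record.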


Pages: 31-42

Page count: 12
