CSI: Clustered segment indexing for efficient approximate searching on the secondary structure of protein sequences

Cited by: 0

Authors:

Seo, M [1]

Park, S [1]

Won, JI [1]

Affiliations:

[1] Yonsei Univ, Dept Comp Sci, Seoul 120749, South Korea

Source:

FOUNDATIONS OF INTELLIGENT SYSTEMS, PROCEEDINGS | 2005 / Vol. 3488

Keywords:

indexing method; secondary structure of proteins; approximate searching

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Approximate searching on the primary structure (i.e., the amino acid arrangement) of protein sequences is an essential part of predicting the functions and evolutionary histories of proteins. However, because evolutionarily distant proteins do not conserve their amino acid residue arrangements, approximate searching on the secondary structure of proteins is important for detecting distant homology. In this paper, we propose an indexing scheme for efficient approximate searching on the secondary structure of protein sequences that can be easily implemented in an RDBMS. Exploiting the concepts of clustering and lookahead, the proposed indexing scheme quickly processes three types of secondary structure queries: exact match, range match, and wildcard match. To evaluate the performance of the proposed method, we conducted extensive experiments using a set of actual protein sequences. According to the experimental results, the proposed method was up to 6.3, 3.3, and 1.5 times faster than existing indexing methods for exact match, range match, and wildcard match queries, respectively.
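To make the three query types concrete, the following is a minimal sketch and not the CSI index described in the abstract. It assumes that a secondary structure is written as a string over the symbols `E`, `H`, and `L` and run-length encoded into (type, length) segments; the abstract does not state this encoding, so treat it as an assumption. The names `to_segments`, `segment_matches`, and `search` are hypothetical, the matcher is a plain linear scan with no clustering or lookahead, and the wildcard is simplified to match exactly one segment of any type.

```python
from itertools import groupby


def to_segments(structure):
    """Run-length encode a structure string, e.g. 'HHHEE' -> [('H', 3), ('E', 2)]."""
    return [(sym, len(list(run))) for sym, run in groupby(structure)]


def segment_matches(segment, predicate):
    """Check one segment against a (symbol, lo, hi) predicate.

    Assumed conventions for this sketch only:
      exact match:    lo == hi
      range match:    lo < hi
      wildcard match: symbol '?' accepts any segment type
    """
    sym, length = segment
    q_sym, lo, hi = predicate
    return (q_sym == "?" or q_sym == sym) and lo <= length <= hi


def search(segments, query):
    """Return the start positions where consecutive segments satisfy the whole query.

    This is a linear scan for illustration; an index would avoid visiting
    every start position.
    """
    hits = []
    for start in range(len(segments) - len(query) + 1):
        if all(segment_matches(segments[start + i], p) for i, p in enumerate(query)):
            hits.append(start)
    return hits


segments = to_segments("HHHEELLLLHHHHEEE")
exact_hits = search(segments, [("H", 3, 3), ("E", 2, 2)])
range_hits = search(segments, [("H", 3, 4), ("E", 2, 3)])
wildcard_hits = search(segments, [("E", 2, 3), ("?", 1, 5)])
```

The returned positions index into the segment list, not into the original string. An index such as the one the abstract proposes would aim to answer the same predicates without scanning every start position.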

Pages: 237-247

Page count: 11
