CSI: Clustered segment indexing for efficient approximate searching on the secondary structure of protein sequences

Cited by: 0
Authors
Seo, M [1]
Park, S [1]
Won, JI [1]
Affiliations
[1] Yonsei Univ, Dept Comp Sci, Seoul 120749, South Korea
Keywords
indexing method; secondary structure of proteins; approximate searching
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Approximate searching on the primary structure (i.e., amino acid arrangement) of protein sequences is an essential part of predicting the functions and evolutionary histories of proteins. However, because evolutionarily distant proteins do not conserve their amino acid residue arrangements, approximate searching on the secondary structure of proteins is important for detecting distant homology. In this paper, we propose an indexing scheme for efficient approximate searching on the secondary structure of protein sequences that can be easily implemented in an RDBMS. Exploiting the concepts of clustering and lookahead, the proposed indexing scheme quickly processes three types of secondary structure queries: exact match, range match, and wildcard match. To evaluate the performance of the proposed method, we conducted extensive experiments using a set of actual protein sequences. In these experiments, the proposed method was up to 6.3 times faster than existing indexing methods for exact match queries, up to 3.3 times faster for range match queries, and up to 1.5 times faster for wildcard match queries.
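The three query types named in the abstract can be illustrated with a minimal sketch. It is not the CSI index itself: it is a naive linear scan, and the encoding is an assumption made for illustration only. A secondary structure is taken to be a string over H (helix), E (strand), and L (loop), run-length encoded into segments, and each query element is a hypothetical triple `(type, lo, hi)` in which `?` stands for any segment type. The names `to_segments` and `matches` are likewise illustrative.

```python
from itertools import groupby


def to_segments(ss):
    """Run-length encode a structure string, e.g. 'HHHEE' -> [('H', 3), ('E', 2)]."""
    return [(seg_type, len(list(run))) for seg_type, run in groupby(ss)]


def matches(segments, query):
    """Return the start indices at which consecutive segments satisfy the query.

    Each query element is (type, lo, hi) (an assumed format):
      exact match    -> lo == hi
      range match    -> lo < hi
      wildcard match -> type '?' accepts any segment type
    """
    hits = []
    for i in range(len(segments) - len(query) + 1):
        window = segments[i:i + len(query)]
        if all((q_type == '?' or q_type == s_type) and lo <= length <= hi
               for (s_type, length), (q_type, lo, hi) in zip(window, query)):
            hits.append(i)
    return hits


segs = to_segments("HHHHLLEEEEELLHHH")
exact = matches(segs, [('H', 4, 4), ('L', 2, 2)])   # helix of 4, then loop of 2
range_q = matches(segs, [('E', 3, 6)])              # strand of length 3 to 6
wild = matches(segs, [('?', 2, 2), ('H', 3, 4)])    # any segment of 2, then helix of 3 to 4
```

A scan like this visits every segment window for every query; an index over segments, such as the one the abstract proposes, exists precisely to avoid that cost.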
Pages: 237-247
Page count: 11