An index structure for pattern similarity searching in DNA microarray data

Cited by: 2
Authors
Wang, HX [1 ]
Perng, CS [1 ]
Fan, W [1 ]
Yu, PS [1 ]
Affiliation
[1] IBM Corp, TJ Watson Res Ctr, Hawthorne, NY 10532 USA
Source
CSB2002: IEEE COMPUTER SOCIETY BIOINFORMATICS CONFERENCE | 2002
DOI
10.1109/CSB.2002.1039348
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
The DNA microarray technology is about to bring an explosion of gene expression data that may dwarf even the human sequencing projects. Researchers are motivated to identify genes whose expression levels rise and fall coherently under a set of experimental perturbances, that is, they exhibit fluctuation of a similar shape when conditions change. In this paper, we show that queries based on pattern correlations against large-scale microarray databases can be supported by the weighted-sequence model, an index structure designed for sequence matching. A weighted-sequence is a two-dimensional structure where each element in the sequence is associated with a weight. We transform the DNA microarray data, as well as pattern-based queries, into weighted-sequences, and use subsequence matching algorithms to retrieve from the database all genes that match the query pattern. We demonstrate, using both synthetic and real-world data sets, that our method is effective and efficient.
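As a rough illustration of the idea in the abstract, the sketch below turns each gene's expression profile into a list of (condition, weight) pairs and retrieves the genes whose fluctuation has the same shape as a query pattern. Everything here is an assumption made for illustration: the names `to_weighted_sequence`, `matches`, and `search`, the choice of weight (expression level relative to the first condition), the tolerance `tol`, and the toy data are not taken from the paper, and a plain linear scan stands in for the index structure the authors describe.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a sequence of (condition, weight) pairs.
WeightedSeq = List[Tuple[str, float]]


def to_weighted_sequence(conditions: List[str], values: List[float]) -> WeightedSeq:
    """Pair each condition with its expression level relative to the first one.

    Assumption: using the offset from the first condition as the weight,
    so that profiles differing only by a constant shift look identical.
    """
    base = values[0]
    return [(c, v - base) for c, v in zip(conditions, values)]


def matches(query: WeightedSeq, candidate: WeightedSeq, tol: float) -> bool:
    """Check whether the candidate follows the query's shape on the query's conditions."""
    cand = dict(candidate)
    first_cond, first_weight = query[0]
    if first_cond not in cand:
        return False
    # Re-base the candidate on the query's first condition.
    offset = cand[first_cond] - first_weight
    return all(
        c in cand and abs((cand[c] - offset) - w) <= tol
        for c, w in query
    )


def search(db: Dict[str, WeightedSeq], query: WeightedSeq, tol: float = 0.5) -> List[str]:
    """Linear scan over all genes (a stand-in for an index lookup)."""
    return [gene for gene, seq in db.items() if matches(query, seq, tol)]


# Toy data, invented for this example.
conditions = ["c1", "c2", "c3", "c4"]
db = {
    "geneA": to_weighted_sequence(conditions, [1.0, 3.0, 2.0, 5.0]),
    "geneB": to_weighted_sequence(conditions, [4.0, 6.0, 5.0, 8.0]),  # geneA shifted by 3
    "geneC": to_weighted_sequence(conditions, [2.0, 1.0, 4.0, 2.0]),
}

# Query pattern over a subset of conditions: drop by 1, then rise by 3.
query = to_weighted_sequence(["c2", "c3", "c4"], [0.0, -1.0, 2.0])
result = search(db, query, tol=0.1)
```

In this toy setup, `geneA` and `geneB` rise and fall together (they differ only by a constant offset), so both match the query, while `geneC` does not. The linear scan costs time proportional to the number of genes; avoiding that cost on large databases is the motivation the abstract gives for an index.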
Pages: 256-267
Page count: 12
Related papers
50 in total
  • [1] A neural network-based similarity index for clustering DNA microarray data
    Sawa, T
    Ohno-Machado, L
    COMPUTERS IN BIOLOGY AND MEDICINE, 2003, 33 (01) : 1 - 15
  • [2] Similarity index for clustering DNA microarray data based on multi-weighted neuron
    Cao, WM
    ROUGH SETS, FUZZY SETS, DATA MINING, AND GRANULAR COMPUTING, PT 2, PROCEEDINGS, 2005, 3642 : 402 - 408
  • [3] Spectral similarity for analysis of DNA microarray time-series data
    Yan, Hong
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2006, 1 (02) : 150 - 161
  • [4] Subdimension-based similarity measure for DNA microarray data clustering
    Lam, Benson S. Y.
    Yan, Hong
    PHYSICAL REVIEW E, 2006, 74 (04):
  • [5] Pattern classification in DNA microarray data of multiple tumor types
    Lin, Tsun-Chen
    Liu, Ru-Sheng
    Chen, Chien-Yu
    Chao, Ya-Ting
    Chen, Shu-Yuan
    PATTERN RECOGNITION, 2006, 39 (12) : 2426 - 2438
  • [6] An index data structure for searching in metric space databases
    Uribe, Roberto
    Navarro, Gonzalo
    Barrientos, Ricardo J.
    Marin, Mauricio
    COMPUTATIONAL SCIENCE - ICCS 2006, PT 1, PROCEEDINGS, 2006, 3991 : 611 - 617
  • [7] Cluster validity for DNA microarray data using a geometrical index
    Lam, BSY
    Yan, H
    PROCEEDINGS OF 2005 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-9, 2005, : 3333 - 3339
  • [8] SiLi Index: Data Structure for Fast Vector Space Searching
    Herman, Ondrej
    Rychly, Pavel
    RASLAN 2019: RECENT ADVANCES IN SLAVONIC NATURAL LANGUAGE PROCESSING, 2019, : 111 - 116
  • [9] A New Measure for Similarity Searching in DNA Sequences
    Zhang, Yusen
    Chen, Wei
    MATCH-COMMUNICATIONS IN MATHEMATICAL AND IN COMPUTER CHEMISTRY, 2011, 65 (02) : 477 - 488
  • [10] THE SIMILARITY INDEX AND DNA FINGERPRINTING
    LYNCH, M
    MOLECULAR BIOLOGY AND EVOLUTION, 1990, 7 (05) : 478 - 484