Gene Expression Data Analysis Using a Novel Approach to Biclustering Combining Discrete and Continuous Data

Cited by: 7
Authors
Christinat, Yann [1]
Wachmann, Bernd [2]
Zhang, Lei [2]
Affiliations
[1] Ecole Polytech Fed Lausanne, Sch Comp & Commun Sci, Lab Computat Biol & Bioinformat, CH-1015 Lausanne, Switzerland
[2] Siemens Corp Res, Princeton, NJ 08540 USA
Keywords
Data mining; biclustering algorithm; gene expression data; discrete data; simultaneous clustering; microarray analysis;
DOI
10.1109/TCBB.2007.70251
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Many different methods exist for pattern detection in gene expression data. In contrast to classical methods, biclustering can cluster a group of genes together with a group of conditions (replicates, sets of patients, or drug compounds). However, since the problem is NP-hard, most algorithms use heuristic search functions and, therefore, might converge toward local maxima. By using the results of biclustering on discrete data as a starting point for a local search function on continuous data, our algorithm avoids the problem of heuristic initialization. Similar to Order-Preserving Submatrices (OPSM), our algorithm aims to detect biclusters whose rows and columns can be ordered such that row values increase across the bicluster's columns and vice versa. Results have been generated on the yeast genome (Saccharomyces cerevisiae), a human cancer data set, and random data. Results on the yeast genome showed that 89 percent of the 100 biggest nonoverlapping biclusters were enriched with Gene Ontology annotations. A comparison with OPSM and the Iterative Signature Algorithm (ISA, a generalization of singular value decomposition) demonstrated better efficiency when using gene and condition orders. We present results on random and real data sets that show the ability of our algorithm to capture statistically significant and biologically relevant biclusters.
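The ordering property described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only checks whether a given submatrix satisfies the stated condition, and it assumes "growing" means strictly increasing. The function names `is_order_preserving` and `is_doubly_ordered` are hypothetical and chosen for illustration.

```python
import numpy as np

def is_order_preserving(sub):
    """True if one column permutation makes every row strictly increasing.

    If such a permutation exists, it must be the one that sorts the first
    row, so it is enough to test that single candidate.
    """
    sub = np.asarray(sub, dtype=float)
    order = np.argsort(sub[0], kind="stable")  # candidate column order
    reordered = sub[:, order]
    return bool(np.all(np.diff(reordered, axis=1) > 0))

def is_doubly_ordered(sub):
    """True if rows increase across columns and columns increase down rows.

    Permuting rows does not change the within-row order, and permuting
    columns does not change the within-column order, so the two
    conditions can be checked independently.
    """
    sub = np.asarray(sub, dtype=float)
    return is_order_preserving(sub) and is_order_preserving(sub.T)

# Illustrative values only, not taken from any data set in the paper.
candidate = [[1, 5, 3],
             [2, 9, 4]]
print(is_order_preserving(candidate))
print(is_doubly_ordered(candidate))
```

Testing only the permutation induced by the first row keeps the check at a single sort plus one pass over the submatrix, which is what makes such a test cheap enough to call repeatedly inside a local search.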
Pages: 583 - 593
Page count: 11
Related papers
50 in total
  • [41] Ensemble Cuckoo Search Biclustering of the gene expression data
    Yin, Lu
    Liu, Yongguo
    2016 IEEE 15TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC), 2016, : 419 - 422
  • [42] An Efficient Weighted Biclustering Algorithm for Gene Expression Data
    Jia, Yankun
    Li, Yidong
    Liu, Wenhua
    Dong, Hairong
    2016 17TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES (PDCAT), 2016, : 336 - 341
  • [43] Seed-Based Biclustering of Gene Expression Data
    An, Jiyuan
    Liew, Alan Wee-Chung
    Nelson, Colleen C.
    PLOS ONE, 2012, 7 (08):
  • [44] Quick hierarchical biclustering on microarray gene expression data
    Ji, Liping
    Mock, Kenneth Wei-Liang
    Tan, Kian-Lee
    BIBE 2006: SIXTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2006, : 110 - +
  • [45] Application of simulated annealing to the biclustering of gene expression data
    Bryan, Kenneth
    Cunningham, Padraig
    Bolshakova, Nadia
    IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, 2006, 10 (03): : 519 - 525
  • [46] A Study of Biclustering Coherence Measures for Gene Expression Data
    Padilha, Victor A.
    de Carvalho, Andre C. P. L. F.
    2018 7TH BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 2018, : 546 - 551
  • [47] Biclustering gene expression data by an improved optimal algorithm
    Wang, MingQian
    Tian, Wei
    Kang, Hao
    Gao, WenJu
    MECHATRONICS AND INDUSTRIAL INFORMATICS, PTS 1-4, 2013, 321-324 : 2223 - 2226
  • [48] Biclustering of expression data using simulated annealing
    Bryan, K
    Cunningham, P
    Bolshakova, N
    18TH IEEE SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, 2005, : 383 - 388
  • [49] Evaluation of Plaid Models in Biclustering of Gene Expression Data
    Majd, Hamid Alavi
    Shahsavari, Soodeh
    Baghestani, Ahmad Reza
    Tabatabaei, Seyyed Mohammad
    Bashi, Naghme Khadem
    Tavirani, Mostafa Rezaei
    Hamidpour, Mohsen
    SCIENTIFICA, 2016, 2016
  • [50] GENE EXPRESSION DATA CLASSIFICATION AND PATTERN ANALYSIS USING DATA DRIVEN APPROACH
    Ramisa, Aiman Jabeen
    Hossain, Ananna
    Islam, S. K. Md Injamul
    Swadesh, Ponuel Mollah
    Islam, Md Toushif
    Rahman, Md Anisur
    Parvez, Mohammad Zavid
    PROCEEDINGS OF 2021 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), 2021, : 82 - 90