A linear time biclustering algorithm for time series gene expression data

Authors
Madeira, SC
Oliveira, AL
Affiliations
[1] INESC, ID, Lisbon, Portugal
[2] Univ Tecn Lisboa, IST, Lisbon, Portugal
[3] Univ Beira Interior, Covilha, Portugal
Source
Keywords
DOI
Not available
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Several non-supervised machine learning methods have been used in the analysis of gene expression data obtained from microarray experiments. Recently, biclustering, a non-supervised approach that performs simultaneous clustering on the row and column dimensions of the data matrix, has been shown to be remarkably effective in a variety of applications. The goal of biclustering is to find subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated behaviors. In the most common settings, biclustering is an NP-complete problem, and heuristic approaches are used to obtain sub-optimal solutions using reasonable computational resources. In this work, we examine a particular setting of the problem, where we are concerned with finding biclusters in time series expression data. In this context, we are interested in finding biclusters with consecutive columns. For this particular version of the problem, we propose an algorithm that finds and reports all relevant biclusters in time linear in the size of the data matrix. This complexity is obtained by manipulating a discretized version of the matrix and by using string processing techniques based on suffix trees. We report results on both synthetic and real data that show the effectiveness of the approach.
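The idea of discretizing the matrix and then looking for genes that share a pattern over consecutive time points can be illustrated with a minimal sketch. This is not the authors' algorithm: it replaces the suffix tree with a plain dictionary, so it is not linear time, and it reports every shared pattern rather than only maximal ones. The three-symbol alphabet (U, D, N), the `threshold` parameter, and the function names `discretize` and `contiguous_biclusters` are assumptions made for illustration only.

```python
from collections import defaultdict


def discretize(matrix, threshold=0.5):
    """Turn each expression row into a string over {U, D, N}.

    Each symbol describes the change between two consecutive time points:
    U (up), D (down), N (no significant change). The alphabet and the
    threshold are illustrative assumptions, not taken from the abstract.
    """
    patterns = []
    for row in matrix:
        symbols = []
        for prev, curr in zip(row, row[1:]):
            delta = curr - prev
            if delta > threshold:
                symbols.append("U")
            elif delta < -threshold:
                symbols.append("D")
            else:
                symbols.append("N")
        patterns.append("".join(symbols))
    return patterns


def contiguous_biclusters(patterns, min_rows=2, min_cols=2):
    """Group genes that share the same substring at the same column span.

    Naive enumeration of all (start, end) spans per row; a suffix tree
    would be needed to reach the linear bound mentioned in the abstract.
    Returns {(start, end, pattern): [gene indices]}.
    """
    groups = defaultdict(list)
    for gene, pattern in enumerate(patterns):
        n = len(pattern)
        for start in range(n):
            for end in range(start + min_cols, n + 1):
                groups[(start, end, pattern[start:end])].append(gene)
    return {key: genes for key, genes in groups.items() if len(genes) >= min_rows}


# Toy data: genes 0 and 1 rise twice and then fall; gene 2 does the opposite.
expression = [
    [1.0, 2.0, 3.0, 2.0],
    [0.0, 1.0, 2.0, 1.0],
    [5.0, 4.0, 3.0, 4.0],
]
patterns = discretize(expression)
biclusters = contiguous_biclusters(patterns)
```

Keying each group on the column span as well as the pattern is what enforces the consecutive-columns requirement: two genes are grouped only when they show the same behavior over the same stretch of time points.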
Pages: 39 - 52
Page count: 14
Related papers
50 in total
  • [31] Biclustering of gene expression data based on hybrid genetic algorithm
    Bagyamani, J.
    Thangavel, K.
    Rathipriya, R.
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2013, 5 (04) : 333 - 350
  • [32] MR-GABiT: Map Reduce based Genetic Algorithm for Biclustering Time Series Data
    Gowri, R.
    Rathipriya, R.
    2016 IEEE INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTER APPLICATIONS (ICACA), 2016, : 381 - 387
  • [33] On Biclustering of Gene Expression Data
    Mukhopadhyay, Anirban
    Maulik, Ujjwal
    Bandyopadhyay, Sanghamitra
    CURRENT BIOINFORMATICS, 2010, 5 (03) : 204 - 216
  • [34] On Biclustering of Gene Expression Data
    Mounir, Mahmoud
    Hamdy, Mohamed
    2015 IEEE SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND INFORMATION SYSTEMS (ICICIS), 2015, : 641 - 648
  • [35] Gene Selection in Time-Series Gene Expression Data
    Adhikari, Prem Raj
    Upadhyaya, Bimal Babu
    Meng, Chen
    Hollmen, Jaakko
    PATTERN RECOGNITION IN BIOINFORMATICS, 2011, 7036 : 145 - +
  • [36] Prognostic Prediction through Biclustering-Based Classification of Clinical Gene Expression Time Series
    Carreiro, Andre V.
    Anunciacao, Orlando
    Carrico, Joao A.
    Madeira, Sara C.
    JOURNAL OF INTEGRATIVE BIOINFORMATICS, 2011, 8 (03)
  • [37] QUBIC: a qualitative biclustering algorithm for analyses of gene expression data
    Li, Guojun
    Ma, Qin
    Tang, Haibao
    Paterson, Andrew H.
    Xu, Ying
    NUCLEIC ACIDS RESEARCH, 2009, 37 (15)
  • [38] Biclustering On Gene Expression Data
    Shruthi, M. P.
    2017 INTERNATIONAL CONFERENCE ON ALGORITHMS, METHODOLOGY, MODELS AND APPLICATIONS IN EMERGING TECHNOLOGIES (ICAMMAET), 2017,
  • [39] Data Perturbation and Recovery of Time Series Gene Expression Data
    Sarkar, Aisharjya
    Mishra, Prabhat
    Kahveci, Tamer
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (02) : 830 - 842
  • [40] A novel gene-centric clustering algorithm for standardization of time series expression data
    Tsiporkova, Elena
    Boeva, Veselka
    2008 4TH INTERNATIONAL IEEE CONFERENCE INTELLIGENT SYSTEMS, VOLS 1 AND 2, 2008, : 533 - +