A Convolutional Deep Clustering Framework for Gene Expression Time Series

Cited by: 7
Authors
Ozgul, Ozan Firat [1]
Bardak, Batuhan [1]
Tan, Mehmet [1]
Affiliations
[1] TOBB Univ Econ & Technol, Dept Comp Engn, TR-06510 Ankara, Turkey
Keywords
Time series analysis; Gene expression; Machine learning; Clustering algorithms; Biological system modeling; Trajectory; Biological information theory; clustering; recurrence plots; deep learning; NF-KAPPA-B; HELICOBACTER-PYLORI; RECURRENCE PLOT;
DOI
10.1109/TCBB.2020.2988985
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The functional or regulatory processes within a cell are explicitly governed by the expression levels of a subset of its genes. Gene expression time series capture the activity of individual genes over time and help reveal the underlying cellular dynamics. An important step in high-throughput gene expression time series experiments is clustering genes by their temporal expression patterns, which is conventionally done with unsupervised machine learning techniques. However, most clustering techniques either suffer from the short length of gene expression time series or ignore the temporal structure of the data. In this work, we propose DeepTrust, a novel deep learning-based framework for gene expression time series clustering that can overcome these issues. DeepTrust first transforms the time series data into images to obtain richer data representations. A deep convolutional clustering algorithm is then applied to the constructed images. Analyses on both simulated and biological data sets demonstrate the efficiency of this new framework compared to widely used clustering techniques. We also use enrichment analyses to illustrate the biological plausibility of the clusters detected by DeepTrust. Our code and data are available from http://github.com/tanlab/DeepTrust.
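The abstract does not say which image transformation is used, but the keywords list recurrence plots. As a rough illustration only, the sketch below turns one short expression profile into a recurrence-plot-style image with NumPy. The function name `recurrence_plot`, the `threshold` parameter, and the example values are assumptions made here for illustration; they are not taken from the DeepTrust code.

```python
import numpy as np


def recurrence_plot(series, threshold=None):
    """Build a T x T image from a 1-D series of length T.

    Hypothetical helper for illustration, not the DeepTrust implementation.
    Without a threshold, pixel (i, j) holds |x_i - x_j|; with a threshold,
    pixel (i, j) is 1 when the two time points are within that distance.
    """
    x = np.asarray(series, dtype=float)
    # Broadcasting yields all pairwise absolute differences at once.
    dist = np.abs(x[:, None] - x[None, :])
    if threshold is None:
        return dist
    return (dist <= threshold).astype(np.uint8)


# A made-up six-point expression profile for a single gene.
profile = [0.1, 0.8, 1.5, 0.9, 0.2, 0.1]
image = recurrence_plot(profile)                  # real-valued distance image
binary = recurrence_plot(profile, threshold=0.2)  # classic binary recurrence plot
```

Even a six-point profile becomes a 6 x 6 image in which similar time points show up as symmetric off-diagonal structure, which is the kind of input a convolutional model can then consume.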
Pages: 2198 - 2207
Page count: 10