Continuous representations of time-series gene expression data

Cited by: 170
Authors
Bar-Joseph, Z
Gerber, GK
Gifford, DK
Jaakkola, TS
Simon, I
Affiliations
[1] MIT, Comp Sci Lab, Cambridge, MA 02139 USA
[2] MIT, Artificial Intelligence Lab, Cambridge, MA 02139 USA
[3] Whitehead Inst Biomed Res, Cambridge, MA 02142 USA
Keywords
time series expression data; missing value estimation; clustering; alignment
DOI
10.1089/10665270360688057
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
We present algorithms for time-series gene expression analysis that permit the principled estimation of unobserved time points, clustering, and dataset alignment. Each expression profile is modeled as a cubic spline (piecewise polynomial) that is estimated from the observed data, and every time point influences the overall smooth expression curve. We constrain the spline coefficients of genes in the same class to have similar expression patterns, while also allowing for gene-specific parameters. We show that unobserved time points can be reconstructed using our method with 10-15% less error than the previous best methods. Our clustering algorithm operates directly on the continuous representations of gene expression profiles, and we demonstrate that this is particularly effective when applied to nonuniformly sampled data. Our continuous alignment algorithm also avoids difficulties encountered by discrete approaches. In particular, our method allows for control of the number of degrees of freedom of the warp through the specification of parameterized functions, which helps to avoid overfitting. We demonstrate that our algorithm produces stable low-error alignments on real expression data and further show a specific application to yeast knock-out data that produces biologically meaningful results.
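To make the idea of a continuous representation concrete, the sketch below fits a natural cubic spline through one expression profile and reads off a value at an unobserved time point. It is only an illustration under stated assumptions: it interpolates each gene independently and does not implement the class-constrained spline model described in the abstract. The function name `natural_cubic_spline` and the sample `times` and `values` are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(ts, ys):
    """Return a function evaluating the natural cubic spline through (ts, ys).

    Assumes ts is a strictly increasing list with at least two points.
    This is plain per-profile interpolation, not the paper's model.
    """
    n = len(ts) - 1
    if n < 1 or len(ys) != len(ts):
        raise ValueError("need at least two points and matching lengths")

    h = [ts[i + 1] - ts[i] for i in range(n)]  # interval widths

    # Right-hand side of the tridiagonal system for the quadratic coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]

    # Forward sweep of the tridiagonal solve (natural boundary: c[0] = c[n] = 0).
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (ts[i + 1] - ts[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]

    # Back substitution for the per-interval coefficients b, c, d.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(t):
        # Locate the interval containing t; clamp so the end pieces extrapolate.
        j = min(max(bisect_right(ts, t) - 1, 0), n - 1)
        dt = t - ts[j]
        return ys[j] + b[j] * dt + c[j] * dt ** 2 + d[j] * dt ** 3

    return evaluate


# Hypothetical, nonuniformly sampled profile (time in minutes, log expression ratio).
times = [0.0, 10.0, 20.0, 40.0, 60.0]
values = [0.1, 0.8, 1.2, 0.3, -0.4]

spline = natural_cubic_spline(times, values)
estimate_at_30 = spline(30.0)  # estimate for a time point that was not measured
print(estimate_at_30)
```

Because the spline is defined for every time in the sampled range, profiles measured on different or nonuniform grids can be compared on a common set of time points, which is the property the clustering and alignment steps in the abstract rely on.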
Pages: 341-356
Number of pages: 16
Related papers
50 in total
  • [41] A Boolean network inference from time-series gene expression data using a genetic algorithm
    Barman, Shohag
    Kwon, Yung-Keun
    BIOINFORMATICS, 2018, 34 (17) : 927 - 933
  • [42] On Graph Time-Series Representations for Temporal Networks
    Rossi, Ryan A.
    Ahmed, Nesreen K.
    Park, Namyong
    COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 14 - 18
  • [43] Time-Series Information and Unsupervised Learning of Representations
    Ryabko, Daniil
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2020, 66 (03) : 1702 - 1713
  • [44] ARMA APPROXIMATIONS AND REPRESENTATIONS OF A STATIONARY TIME-SERIES
    POURAHMADI, M
    SANKHYA-THE INDIAN JOURNAL OF STATISTICS SERIES B, 1992, 54 : 235 - 241
  • [45] EXPLORING COMPLEX TIME-SERIES REPRESENTATIONS FOR RIEMANNIAN MACHINE LEARNING OF RADAR DATA
    Brooks, Daniel A.
    Schwander, Olivier
    Barbaresco, Frederic
    Schneider, Jean-Yves
    Cord, Matthieu
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3672 - 3676
  • [46] Soft computing methods to predict gene regulatory networks: An integrative approach on time-series gene expression data
    Chan, Zeke S. H.
    Havukkala, Ilkka
    Jain, Vishal
    Hu, Yingjie
    Kasabov, Nikola
    APPLIED SOFT COMPUTING, 2008, 8 (03) : 1189 - 1199
  • [47] CONTINUOUS HYDROLOGICAL TIME-SERIES DISCRETIZATION
    TAVARES, LV
    JOURNAL OF THE HYDRAULICS DIVISION-ASCE, 1975, 101 (NHY1) : 49 - 63
  • [48] A System for Retrieving Time-Series Data Based on Linguistic Expression
    Otsuka, Naoya
    Hasui, Daiki
    Matsushita, Mitsunori
    2013 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2013, : 252 - 256
  • [49] Modeling and analysis of gene expression time-series based on co-expression
    Möller-Levet, CS
    Yin, HJ
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2005, 15 (04) : 311 - 322
  • [50] Time-Series Data Mining
    Esling, Philippe
    Agon, Carlos
    ACM COMPUTING SURVEYS, 2012, 45 (01)