Greedy Gaussian segmentation of multivariate time series

Cited by: 1
Authors
David Hallac
Peter Nystrup
Stephen Boyd
Affiliations
[1] Stanford University
[2] Technical University of Denmark
Keywords
Time series analysis; Change-point detection; Financial regimes; Text segmentation; Covariance regularization; Greedy algorithms; 37M10: Time series analysis
DOI
Not available
Abstract
We consider the problem of breaking a multivariate (vector) time series into segments over which the data is well explained as independent samples from a Gaussian distribution. We formulate this as a covariance-regularized maximum likelihood problem, which can be reduced to a combinatorial optimization problem of searching over the possible breakpoints, or segment boundaries. This problem can be solved using dynamic programming, with complexity that grows with the square of the time series length. We propose a heuristic method that approximately solves the problem in linear time with respect to this length, and always yields a locally optimal choice, in the sense that no change of any one breakpoint improves the objective. Our method, which we call greedy Gaussian segmentation (GGS), easily scales to problems with vectors of dimension over 1000 and time series of arbitrary length. We discuss methods that can be used to validate such a model using data, and also to automatically choose appropriate values of the two hyperparameters in the method. Finally, we illustrate our GGS approach on financial time series and Wikipedia text data.
Pages: 727 - 751
Number of pages: 24
Related papers
50 in total
  • [41] Representation learning for unsupervised heterogeneous multivariate time series segmentation and its application
    Kim, Hyunjoong
    Kim, Han Kyul
    Kim, Misuk
    Park, Jooseoung
    Cho, Sungzoon
    Im, Keyng Bin
    Ryu, Chang Ryeol
    COMPUTERS & INDUSTRIAL ENGINEERING, 2019, 130 : 272 - 281
  • [42] Joint segmentation of multivariate astronomical time series: Bayesian sampling with a hierarchical model
    Dobigeon, Nicolas
    Tourneret, Jean-Yves
    Scargle, Jeffrey D.
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2007, 55 (02) : 414 - 423
  • [43] Adaptive G-G clustering for fuzzy segmentation of multivariate time series
    Wang, Ling
    Zhu, Hui
    Jia, Gaofeng
    STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2020, 34 (09) : 1353 - 1367
  • [44] A hybrid segmentation method for multivariate time series based on the dynamic factor model
    Sun, Zhubin
    Liu, Xiaodong
    Wang, Lizhu
    STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2017, 31 (06) : 1291 - 1304
  • [45] tGLAD: A Sparse Graph Recovery Based Approach for Multivariate Time Series Segmentation
    Imani, Shima
    Shrivastava, Harsh
    ADVANCED ANALYTICS AND LEARNING ON TEMPORAL DATA, AALTD 2023, 2023, 14343 : 176 - 189
  • [46] Adaptive Segmentation of Multivariate Time Series with FastICA and G-G Clustering
    Wang L.
    Li Z.-Z.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (05) : 1235 - 1244
  • [47] Simulation of multivariate non-gaussian autoregressive time series with given autocovariance and marginals
    Kugiumtzis, Dimitris
    Bora-Senta, Efthimia
    SIMULATION MODELLING PRACTICE AND THEORY, 2014, 44 : 42 - 53
  • [48] Statistical inference of multivariate distribution parameters for non-Gaussian distributed time series
    Repetowicz, P
    Richmond, P
    ACTA PHYSICA POLONICA B, 2005, 36 (09) : 2785 - 2796
  • [49] Fast and exact synthesis of stationary multivariate Gaussian time series using circulant embedding
    Helgason, Hannes
    Pipiras, Vladas
    Abry, Patrice
    SIGNAL PROCESSING, 2011, 91 (05) : 1123 - 1133
  • [50] Non-linear autoregressive time series with multivariate Gaussian mixtures as marginal distributions
    Glasbey, CA
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2001, 50 : 143 - 154