Time series representation and similarity based on local autopatterns

Cited by: 1
Authors
Mustafa Gokce Baydogan
George Runger
Affiliations
[1] Boğaziçi University, Department of Industrial Engineering
[2] Arizona State University, School of Computing, Informatics and Decision Systems Engineering
Source
Data Mining and Knowledge Discovery, 2016, 30 (02)
Keywords
Time series; Similarity; Pattern discovery; Autoregression; Regression tree
DOI
Not available
Abstract
Time series data mining has received much greater interest along with the increase in temporal data sets from different domains such as medicine, finance, multimedia, etc. Representations are important to reduce dimensionality and generate useful similarity measures. High-level representations such as Fourier transforms, wavelets, piecewise polynomial models, etc., were considered previously. Recently, autoregressive kernels were introduced to reflect the similarity of the time series. We introduce a novel approach to model the dependency structure in time series that generalizes the concept of autoregression to local autopatterns. Our approach generates a pattern-based representation along with a similarity measure called learned pattern similarity (LPS). A tree-based ensemble-learning strategy that is fast and insensitive to parameter settings is the basis for the approach. Then, a robust similarity measure based on the learned patterns is presented. This unsupervised approach to represent and measure the similarity between time series generally applies to a number of data mining tasks (e.g., clustering, anomaly detection, classification). Furthermore, an embedded learning of the representation avoids pre-defined features and an extraction step which is common in some feature-based approaches. The method generalizes in a straightforward manner to multivariate time series. The effectiveness of LPS is evaluated on time series classification problems from various domains. We compare LPS to eleven well-known similarity measures. Our experimental results show that LPS provides fast and competitive results on benchmark datasets from several domains. Furthermore, LPS provides a research direction and template approach that breaks from the linear dependency models to potentially foster other promising nonlinear approaches.
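The abstract describes the idea only at a high level, so the following is a minimal sketch of a learned-pattern style similarity, not the authors' LPS implementation. It assumes a much simplified setup: each tree regresses a lagged value `x[t + lag]` on `x[t]` over all series pooled together, each series is then represented by how often its points fall into each leaf, and two series are compared by the L1 distance between those leaf counts. The function names (`fit_tree`, `fit_ensemble`, `represent`, `dissimilarity`), the single-predictor trees, and all parameter values are illustrative assumptions.

```python
import math
import random
from collections import Counter


def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def fit_tree(rows, depth):
    """Fit a one-predictor regression tree on (x, y) rows.

    Returns None for a leaf, else (threshold, left_subtree, right_subtree).
    """
    if depth == 0 or len(rows) < 4:
        return None
    xs = sorted(set(x for x, _ in rows))
    best = None
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        cost = (sse([y for x, y in rows if x <= thr])
                + sse([y for x, y in rows if x > thr]))
        if best is None or cost < best[0]:
            best = (cost, thr)
    if best is None:  # all predictor values identical: nothing to split
        return None
    thr = best[1]
    left = fit_tree([r for r in rows if r[0] <= thr], depth - 1)
    right = fit_tree([r for r in rows if r[0] > thr], depth - 1)
    return (thr, left, right)


def leaf_of(tree, x):
    """Return the leaf reached by x, encoded as a path of 'L'/'R' moves."""
    path = ""
    while tree is not None:
        thr, left, right = tree
        if x <= thr:
            path, tree = path + "L", left
        else:
            path, tree = path + "R", right
    return path


def fit_ensemble(dataset, n_trees=10, max_lag=5, depth=3, seed=0):
    """Fit trees on pooled (x[t], x[t+lag]) pairs, one random lag per tree."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        lag = rng.randint(1, max_lag)
        rows = [(s[t], s[t + lag]) for s in dataset for t in range(len(s) - lag)]
        trees.append((lag, fit_tree(rows, depth)))
    return trees


def represent(series, trees):
    """Count how many points of a series land in each (tree, leaf) pair."""
    counts = Counter()
    for k, (lag, tree) in enumerate(trees):
        for t in range(len(series) - lag):
            counts[(k, leaf_of(tree, series[t]))] += 1
    return counts


def dissimilarity(a, b):
    """L1 distance between two leaf-count representations."""
    return sum(abs(a[key] - b[key]) for key in set(a) | set(b))


# Toy data: two sine waves with different phase and one linear ramp.
data = [
    [math.sin(0.3 * t) for t in range(40)],
    [math.sin(0.3 * t + 0.5) for t in range(40)],
    [t / 40 for t in range(40)],
]
trees = fit_ensemble(data)
reps = [represent(s, trees) for s in data]
```

In this sketch no class labels are used at any point, which mirrors the unsupervised character described in the abstract; the resulting dissimilarities could then feed a nearest-neighbor classifier or a clustering routine.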
Pages: 476 - 509
Page count: 33
Related papers
50 in total
  • [1] Time series representation and similarity based on local autopatterns
    Baydogan, Mustafa Gokce
    Runger, George
    DATA MINING AND KNOWLEDGE DISCOVERY, 2016, 30 (02) : 476 - 509
  • [2] Trend and Value based Time Series Representation for Similarity Search
    Kane, Aminata
    2017 IEEE THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM 2017), 2017, : 252 - 259
  • [3] Similarity measure based on multidimensional shape feature representation for time series
    Li, H.-L., 1600, Systems Engineering Society of China (33):
  • [4] Randomized trees for time series representation and similarity
    Gorgulu, Berk
    Baydogan, Mustafa Gokce
    PATTERN RECOGNITION, 2021, 120
  • [5] Feature Representation and Similarity Measure Based on Covariance Sequence for Multivariate Time Series
    Li, Hailin
    Lin, Chunpei
    Wan, Xiaoji
    Li, Zhengxin
    IEEE ACCESS, 2019, 7 : 67018 - 67026
  • [6] Similarity search based on shape representation in time-series data sets
    Jiang, Rong
    Li, Deyi
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2000, 37 (05) : 601 - 608
  • [7] A Bit Level Representation for Time Series Data Mining with Shape Based Similarity
    Anthony Bagnall
    Chotirat “Ann” Ratanamahatana
    Eamonn Keogh
    Stefano Lonardi
    Gareth Janacek
    Data Mining and Knowledge Discovery, 2006, 13 : 11 - 40
  • [8] An Enhanced Binary Symbolic Representation for Time Series Data Mining Based Similarity
    Sun, Meiyu
    Fang, Jianan
    2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 7130 - 7134
  • [9] A bit level representation for time series data mining with shape based similarity
    Bagnall, Anthony
    Ratanamahatana, Chotirat 'Ann'
    Keogh, Eamonn
    Lonardi, Stefano
    Janacek, Gareth
    DATA MINING AND KNOWLEDGE DISCOVERY, 2006, 13 (01) : 11 - 40
  • [10] Similarity Preserving Representation Learning for Time Series Clustering
    Lei, Qi
    Yi, Jinfeng
    Vaculin, Roman
    Wu, Lingfei
    Dhillon, Inderjit S.
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2845 - 2851