Scalable classifier-agnostic channel selection for multivariate time series classification

Cited by: 9
Authors
Dhariyal, Bhaskar [1]
Le Nguyen, Thach [1]
Ifrim, Georgiana [1]
Affiliations
[1] Univ Coll Dublin, Insight Ctr Data Analyt, Sch Comp Sci, Dublin, Ireland
Funding
Science Foundation Ireland;
Keywords
Multivariate time series; Channel selection; Scalability; Classification; STATISTICAL COMPARISONS;
DOI
10.1007/s10618-022-00909-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accuracy is a key focus of current work in time series classification. However, speed and data reduction are equally important in many applications, especially when the data scale and storage requirements rapidly increase. Current multivariate time series classification (MTSC) algorithms need hundreds of compute hours to complete training and prediction. This is due to the nature of multivariate time series data which grows with the number of time series, their length and the number of channels. In many applications, not all the channels are useful for the classification task, hence we require methods that can efficiently select useful channels and thus save computational resources. We propose and evaluate two methods for channel selection. Our techniques work by representing each class by a prototype time series and performing channel selection based on the prototype distance between classes. The main hypothesis is that useful channels enable better separation between classes; hence, channels with a larger distance between class prototypes are more useful. On the UEA MTSC benchmark, we show that these techniques achieve significant data reduction and classifier speedup for similar levels of classification accuracy. Channel selection is applied as a pre-processing step before training state-of-the-art MTSC algorithms and saves about 70% of computation time and data storage with preserved accuracy. Furthermore, our methods enable efficient classifiers, such as ROCKET, to achieve better accuracy than using no selection or greedy forward channel selection. To further study the impact of our techniques, we present experiments on classifying synthetic multivariate time series datasets with more than 100 channels, as well as a real-world case study on a dataset with 50 channels. In both cases, our channel selection methods result in significant data reduction with preserved or improved accuracy.
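The prototype-distance idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how prototypes are built, which distance is used, or how the final channel subset is chosen, so everything below is an assumption for illustration only: prototypes are per-class mean series (centroids), the distance is Euclidean, scores are summed over all class pairs, and the top `k` channels are kept. The function names `channel_scores` and `select_channels` are hypothetical and do not come from the paper.

```python
from itertools import combinations

import numpy as np


def channel_scores(X, y):
    """Score each channel by the summed distance between class prototypes.

    X: array of shape (n_series, n_channels, series_length)
    y: array of shape (n_series,) holding class labels
    """
    classes = np.unique(y)
    # Assumed prototype: the per-class mean series, one per channel.
    # Resulting shape: (n_classes, n_channels, series_length)
    prototypes = np.stack([X[y == c].mean(axis=0) for c in classes])

    scores = np.zeros(X.shape[1])
    for a, b in combinations(range(len(classes)), 2):
        # Assumed distance: Euclidean, computed per channel along the time axis.
        scores += np.linalg.norm(prototypes[a] - prototypes[b], axis=1)
    return scores


def select_channels(X, y, k):
    """Return the sorted indices of the k highest-scoring channels (assumed rule)."""
    scores = channel_scores(X, y)
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)


# Toy data: 5 channels of noise, with only channel 2 separating the two classes.
rng = np.random.default_rng(0)
n_series, n_channels, length = 40, 5, 30
y = np.repeat([0, 1], n_series // 2)
X = rng.normal(size=(n_series, n_channels, length))
X[y == 1, 2, :] += 3.0

selected = select_channels(X, y, k=1)
X_reduced = X[:, selected, :]  # reduced data to pass to any downstream classifier
```

Because the selection only produces a reduced array, it can be run as a pre-processing step before any classifier, which is the classifier-agnostic property the title refers to.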
Pages: 1010-1054
Page count: 45
Source
Data Mining and Knowledge Discovery, 2023, 37: 1010-1054