A Fast Semi-Supervised Clustering Framework for Large-Scale Time Series Data

Cited by: 18
Authors
He, Guoliang [1]
Pan, Yanzhou [2]
Xia, Xuewen [3]
He, Jinrong [4]
Peng, Rong [1]
Xiong, Neal N. [5]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430079, Peoples R China
[2] Rice Univ, Engn Dept, Houston, TX 77005 USA
[3] Minnan Normal Univ, Coll Phys & Informat Engn, Zhangzhou 363000, Peoples R China
[4] Yanan Univ, Coll Math & Comp Sci, Yanan 716000, Peoples R China
[5] Northeastern State Univ, Dept Math & Comp Sci, Tahlequah, OK 74464 USA
Funding
National Natural Science Foundation of China;
Keywords
Time series analysis; Clustering algorithms; Time measurement; Velocity measurement; Shape measurement; Clustering methods; Contracts; Constraint propagation; semi-supervised learning; similarity measure; time series clustering; CLASSIFICATION;
DOI
10.1109/TSMC.2019.2931731
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Semi-supervised clustering algorithms have several limitations: 1) their computational complexity is very high, because calculating the similarity distances between pairs of examples is time-consuming; 2) traditional semi-supervised clustering methods do not consider how to make full use of must-link and cannot-link constraints. A few pairwise constraints contribute very little to clustering performance, and some may even harm the outcome; and 3) these methods do not handle high-dimensional data effectively, especially time series data. To date, few studies have addressed semi-supervised clustering of time series data. To cluster large-scale time series data efficiently, we first tackle contract time series clustering, which aims to produce the most accurate clustering results within a contracted time. We propose a semi-supervised time series clustering framework (STSC) that integrates a fast similarity measure and a constraint propagation approach. Based on the proposed framework, we design two semi-supervised clustering algorithms, fssK-means and fssDBSCAN. Experiments on 11 datasets show that the proposed method is efficient and effective for clustering large-scale time series data.
Pages: 4201-4216
Number of pages: 16
Related papers
50 in total
  • [21] Incremental learning algorithm for large-scale semi-supervised ordinal regression
    Chen, Haiyan
    Jia, Yizhen
    Ge, Jiaming
    Gu, Bin
    NEURAL NETWORKS, 2022, 149 : 124 - 136
  • [22] Semi-Supervised Eigenvectors for Large-Scale Locally-Biased Learning
    Hansen, Toke J.
    Mahoney, Michael W.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2014, 15 : 3691 - 3734
  • [23] SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification
    Rosenthal, Sara
    Atanasova, Pepa
    Karadzhov, Georgi
    Zampieri, Marcos
    Nakov, Preslav
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 915 - 928
  • [24] A semi-supervised clustering algorithm for data exploration
    Bouchachia, A
    Pedrycz, W
    FUZZY SETS AND SYSTEMS - IFSA 2003, PROCEEDINGS, 2003, 2715 : 328 - 337
  • [25] Semi-Supervised Clustering and Aggregation of Relational Data
    Frigui, Hichem
    Hwang, Cheul
    2008 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS, VOLS 1-3, 2008, : 1087 - 1092
  • [26] A New semi-supervised clustering for incomplete data
    Goel, Sonia
    Tushir, Meena
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (02) : 727 - 739
  • [27] Fast semi-supervised clustering with enhanced spectral embedding
    Jiao, L. C.
    Shang, Fanhua
    Wang, Fei
    Liu, Yuanyuan
    PATTERN RECOGNITION, 2012, 45 (12) : 4358 - 4369
  • [28] Robust Semi-Supervised Fuzzy C-Means Clustering for Time Series
    Xu, Jiucheng
    Hou, Qinchen
    Qu, Kanglin
    Sun, Yuanhao
    Meng, Xiangru
    Computer Engineering and Applications, 2023, 59 (08) : 73 - 80
  • [29] Fast Semi-Supervised Fuzzy Clustering: Approach and Application
    Cai, Jia-xin
    Yang, Feng
    Feng, Guo-can
    PROCEEDINGS OF THE 2009 CHINESE CONFERENCE ON PATTERN RECOGNITION AND THE FIRST CJK JOINT WORKSHOP ON PATTERN RECOGNITION, VOLS 1 AND 2, 2009, : 108 - +
  • [30] Fast Semi-supervised Classification Based on Bisecting Clustering
    Liu, Xiaolan
    Hao, Zhifeng
    Liu, Jingao
    Lin, Zhiyong
    2ND IEEE INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER CONTROL (ICACC 2010), VOL. 4, 2010, : 207 - 211