MVStream: Multiview Data Stream Clustering

Cited by: 37
Authors
Huang, Ling [1 ,2 ,3 ]
Wang, Chang-Dong [1 ,2 ,3 ]
Chao, Hong-Yang [1 ,3 ]
Yu, Philip S. [4 ,5 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Peoples R China
[2] Sun Yat Sen Univ, Guangdong Prov Key Lab Computat Sci, Guangzhou 510006, Peoples R China
[3] Minist Educ, Key Lab Machine Intelligence & Adv Comp, Guangzhou 510006, Peoples R China
[4] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
[5] Tsinghua Univ, Inst Data Sci, Beijing 100084, Peoples R China
Keywords
Clustering algorithms; Shape; Task analysis; Support vector machines; Indexes; Data models; Computer science; Clustering; clusters of arbitrary shapes; data stream; multiview; support vector (SV); ALGORITHM;
DOI
10.1109/TNNLS.2019.2944851
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article studies a new problem in data stream clustering, namely, multiview data stream (MVStream) clustering. Although many data stream clustering algorithms have been developed, they are restricted to single-view streaming data, and clustering MVStreams remains largely unsolved. In addition to the issues encountered in conventional single-view data stream clustering, such as capturing cluster evolution and discovering clusters of arbitrary shapes under limited computational resources, the main challenge of MVStream clustering lies in integrating information from multiple views in a streaming manner while simultaneously abstracting summary statistics from the integrated features. In this article, we propose a novel MVStream clustering algorithm for the first time. The main idea is to design a multiview support vector domain description (MVSVDD) model, by which the information from multiple insufficient views can be integrated, and the support vectors (SVs) it outputs are used to abstract the summary statistics of the historical multiview data objects. Based on the MVSVDD model, a new multiview cluster labeling method is designed, whereby clusters of arbitrary shapes can be discovered in each view. By tracking the cluster labels of the SVs in each view, the cluster evolution associated with concept drift can be captured. Since the SVs account for only a small portion of the data objects, the proposed MVStream algorithm is efficient under limited computational resources. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of the proposed method.
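For orientation only, the block below sketches the standard single-view support vector domain description (SVDD) on which models of this kind build. It is not the MVSVDD formulation of the article, which this record does not reproduce. The symbols (center a, radius R, slack variables \xi_i, trade-off parameter C, feature map \phi, kernel K, dual coefficients \beta_i) are the usual textbook notation and are assumptions here.

```latex
% Standard single-view SVDD (primal): smallest enclosing sphere in feature space.
\min_{R,\,a,\,\xi}\; R^2 + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad
\|\phi(x_i) - a\|^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, N.

% Dual problem in terms of a kernel K(x_i, x_j) = \phi(x_i)^\top \phi(x_j).
\max_{\beta}\; \sum_{i=1}^{N} \beta_i K(x_i, x_i)
 - \sum_{i=1}^{N} \sum_{j=1}^{N} \beta_i \beta_j K(x_i, x_j)
\quad \text{s.t.} \quad
\sum_{i=1}^{N} \beta_i = 1, \qquad 0 \le \beta_i \le C.

% Squared distance of a point x from the sphere center a = \sum_i \beta_i \phi(x_i).
R^2(x) = K(x, x) - 2 \sum_{i=1}^{N} \beta_i K(x_i, x)
 + \sum_{i=1}^{N} \sum_{j=1}^{N} \beta_i \beta_j K(x_i, x_j).
```

Only the objects with \beta_i > 0 (the SVs) enter the sums, which is why, in the single-view case, a small set of SVs can stand in for the historical data.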
Pages: 3482 - 3496
Number of pages: 15
Related papers
50 in total
  • [21] Research on data stream clustering algorithms
    Shifei Ding
    Fulin Wu
    Jun Qian
    Hongjie Jia
    Fengxiang Jin
    Artificial Intelligence Review, 2015, 43 : 593 - 600
  • [22] Research on data stream clustering algorithms
    Ding, Shifei
    Wu, Fulin
    Qian, Jun
    Jia, Hongjie
    Jin, Fengxiang
    ARTIFICIAL INTELLIGENCE REVIEW, 2015, 43 (04) : 593 - 600
  • [23] An Improved Data Stream Algorithm for Clustering
    Kim, Sang-Sub
    Ahn, Hee-Kap
    LATIN 2014: THEORETICAL INFORMATICS, 2014, 8392 : 273 - 284
  • [24] Sparse Subspace Clustering for Stream Data
    Chen, Ken
    Tang, Yong
    Wei, Long
    Wang, Pengfei
    Liu, Yong
    Jin, Zhongming
    IEEE ACCESS, 2021, 9 : 57271 - 57279
  • [25] A survey on data stream clustering and classification
    Hai-Long Nguyen
    Yew-Kwong Woon
    Wee-Keong Ng
    Knowledge and Information Systems, 2015, 45 : 535 - 569
  • [26] Clustering Models for Data Stream Mining
    Mythily, R.
    Banu, Aisha
    Raghunathan, Shriram
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES, ICICT 2014, 2015, 46 : 619 - 626
  • [27] Data Stream Clustering: Challenges and Issues
    Khalilian, Madjid
    Mustapha, Norwati
    INTERNATIONAL MULTICONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS (IMECS 2010), VOLS I-III, 2010, : 566 - +
  • [28] Subgraph Propagation and Contrastive Calibration for Incomplete Multiview Data Clustering
    Dong, Zhibin
    Jin, Jiaqi
    Xiao, Yuyang
    Xiao, Bin
    Wang, Siwei
    Liu, Xinwang
    Zhu, En
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (02) : 3218 - 3230
  • [29] Multiview Spectral Clustering of High-Dimensional Observational Data
    Roman-Messina, A.
    Castro-Arvizu, Claudia M.
    Castillo-Tapia, Alejandro
    Murillo-Aguirre, Erlan R.
    Rodriguez-Villalon, O.
    IEEE ACCESS, 2023, 11 : 115884 - 115893
  • [30] Simultaneous multi-graph learning and clustering for multiview data
    Ma, Xuanlong
    Yan, Xueming
    Liu, Jingfa
    Zhong, Guo
    INFORMATION SCIENCES, 2022, 593 : 472 - 487