Data Stream Clustering: Introducing Recursively Extendable Aggregation Functions for Incremental Cluster Fusion Processes

Cited by: 0
Authors
Urio-Larrea, A. [1]
Camargo, H. [2]
Lucca, G. [3]
Asmus, T. [4, 5]
Marco-Detchart, C. [1]
Schick, L. [2]
Lopez-Molina, C. [1]
Andreu-Perez, J. [6]
Bustince, H. [1]
Dimuro, G. P. [4, 5]
Affiliations
[1] Univ Publ Navarra, Dept Estadist, Pamplona 31006, Spain
[2] Univ Fed Sao Carlos, Dept Computac, BR-13565905 Sao Carlos, Brazil
[3] Univ Catolica Pelotas, Ctr Ciencias Sociais & Tecnol, BR-96015560 Pelotas, Brazil
[4] Univ Fed Rio Grande, Inst Matemat Estat & Fisica, BR-96203900 Rio Grande, Brazil
[5] Univ Fed Rio Grande, Ctr Ciencias Computacionais, BR-96203900 Rio Grande, Brazil
[6] Univ Essex, Sch Comp Sci & Elect Engn, Colchester, England
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
Aggregation functions; data streams (DSs); fuzzy clustering; overlap indices; similarity measures;
DOI
10.1109/TCYB.2025.3527862
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In data stream (DS) learning, the system has to extract knowledge from data generated continuously, usually at high speed and in large volumes, making it impossible to store the entire set of data to be processed in batch mode. Hence, machine learning models must be built incrementally by processing the incoming examples, as data arrive, while updating the model to be compatible with the current data. In fuzzy DS clustering, the model can either absorb incoming data into existing clusters or initiate a new cluster. As the volume of data increases, there is a possibility that the clusters will overlap to the point where it is convenient to merge two or more clusters into one. Then, a cluster comparison measure (CM) should be applied, to decide whether such clusters should be combined, also in an incremental manner. This defines an incremental fusion process based on aggregation functions that can aggregate the incoming inputs without storing all the previous inputs. The objective of this article is to solve the fuzzy DS clustering problem of incrementally comparing fuzzy clusters on a formal basis. First, we formalize and operationalize incremental fusion processes of fuzzy clusters by introducing recursively extendable (RE) aggregation functions, studying construction methods and different classes of such functions. Second, we propose two approaches to compare clusters: 1) similarity and 2) overlapping between clusters, based on RE aggregation functions. Finally, we analyze the effect of those incremental CMs on the online and offline phases of the well-known fuzzy clustering algorithm d-FuzzStream, showing that our new approach outperforms the original algorithm and presents better or comparable performance to other state-of-the-art DS clustering algorithms found in the literature.
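The idea of aggregating incoming inputs without storing the previous ones can be illustrated with a minimal Python sketch. It assumes the arithmetic mean as the aggregation function, chosen here only because its running form is well known; the class `RunningMean` and the sample stream are hypothetical names and values for illustration, not the constructions or data used in the article.

```python
from dataclasses import dataclass


@dataclass
class RunningMean:
    """Incremental arithmetic mean (illustrative assumption, not the paper's method).

    Only the number of inputs seen so far and the current aggregate are stored.
    """
    count: int = 0
    value: float = 0.0

    def update(self, x: float) -> float:
        # New aggregate is computed from the previous aggregate and the new input only.
        self.count += 1
        self.value += (x - self.value) / self.count
        return self.value


def batch_mean(xs):
    """Reference computation that needs the whole list in memory."""
    return sum(xs) / len(xs)


# Hypothetical membership-like values arriving one at a time.
stream = [0.2, 0.8, 0.5, 0.9]

agg = RunningMean()
for x in stream:
    agg.update(x)

# The incremental result matches the batch result up to floating-point error.
print(abs(agg.value - batch_mean(stream)) < 1e-12)
```

The state kept between updates is constant in size, which is the property that makes this kind of aggregation usable when the full stream cannot be stored.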
Pages: 1421-1435
Number of pages: 15