Learning evolving prototypes for imbalanced data stream classification with limited labels

Cited by: 1
Authors
Wu, Zhonglin [1]
Wang, Hongliang [1]
Guo, Jingxia [1]
Yang, Qinli [1]
Shao, Junming [1,2,3]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Data Min Lab, Chengdu, Peoples R China
[2] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Quzhou, Huzhou, Peoples R China
[3] Univ Elect Sci & Technol China, Shenzhen Inst Adv Study, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data streams; Concept drift; Imbalanced learning; Active learning
DOI
10.1016/j.ins.2024.120979
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Real-world data streams often exhibit long-tailed distributions with heavy class imbalance, posing great challenges for data stream classification, especially under label scarcity and concept drift. Several active learning methods have been proposed to address this problem by selecting the most valuable instances for labeling. However, existing methods often struggle to dynamically identify the instances that truly represent the current concept, and they still require a large label budget. In this work, we propose a new algorithm, LEPID, which combines dynamic micro-cluster concept modeling and local entropy modeling to select the currently important concepts and prototypes. Specifically, we give greater weight to concept drift prototypes and minority prototypes to focus more on the regions that represent current concepts. We use a local entropy strategy based on micro-clusters to select the most valuable instances for labeling and to reduce the label budget. Extensive experiments on real-world and synthetic imbalanced datasets show that, compared to state-of-the-art algorithms, our method can naturally adapt to concept drift and dynamically capture the current and most valuable prototypes, achieving better results even when labels are scarce.
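The abstract does not give the details of the local entropy strategy, so the following is only a minimal illustrative sketch of the general idea, not the LEPID implementation. It assumes each micro-cluster is a dictionary with a hypothetical "center", "label", and "weight" field, and that an instance is worth a label request when the weighted class distribution of its k nearest micro-clusters has high entropy. The function names, the parameter k, and the threshold are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def local_entropy(x, micro_clusters, k=3):
    """Shannon entropy (in bits) of the weighted class distribution
    among the k micro-clusters whose centers are nearest to x.

    Hypothetical representation: each micro-cluster is a dict with
    "center" (sequence of floats), "label", and "weight" (positive float).
    """
    nearest = sorted(
        micro_clusters, key=lambda mc: math.dist(x, mc["center"])
    )[:k]
    # Accumulate the weight of each class in the neighbourhood.
    mass = defaultdict(float)
    for mc in nearest:
        mass[mc["label"]] += mc["weight"]
    total = sum(mass.values())
    return -sum(
        (m / total) * math.log2(m / total) for m in mass.values() if m > 0
    )


def should_query(x, micro_clusters, k=3, threshold=0.5):
    """Request a label only when the neighbourhood of x is ambiguous.

    The threshold is an assumed tuning knob, not a value from the paper.
    """
    return local_entropy(x, micro_clusters, k) > threshold


# Toy data: two micro-clusters per class, all with equal weight.
clusters = [
    {"center": (0.0, 0.0), "label": 0, "weight": 1.0},
    {"center": (0.5, 0.0), "label": 0, "weight": 1.0},
    {"center": (5.0, 5.0), "label": 1, "weight": 1.0},
    {"center": (5.5, 5.0), "label": 1, "weight": 1.0},
]
```

In this sketch, raising the "weight" of minority or recently drifted micro-clusters would make the regions around them look more uncertain and therefore more likely to be queried, which is one plausible reading of the weighting described in the abstract.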
Pages: 20
Related papers
50 in total
  • [21] Imbalanced Data Stream Classification Assisted by Prior Probability Estimation
    Komorniczak, Joanna
    Zyblewski, Pawel
    Ksieniewicz, Pawel
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [22] SGDOL: Self-evolving Generative and Discriminative Online Learning for Data Stream Classification
    Aggarwal, Deeksha
    Senthilnath, J.
    Kumar, Uttam
    Yadav, Vivek
    Kulkarni, Sushant
    Ferdaus, Md Meftahul
    Li, Xiaoli
    21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 322 - 330
  • [23] Hellinger Distance Weighted Ensemble for imbalanced data stream classification
    Grzyb, Joanna
    Klikowski, Jakub
    Wozniak, Michal
    JOURNAL OF COMPUTATIONAL SCIENCE, 2021, 51
  • [24] Imbalanced classification by learning hidden data structure
    Zhao, Yang
    Shrivastava, Abhishek K.
    Tsui, Kwok Leung
    IIE TRANSACTIONS, 2016, 48 (07) : 614 - 628
  • [25] Dynamic Curriculum Learning for Imbalanced Data Classification
    Wang, Yiru
    Gan, Weihao
    Yang, Jie
    Wu, Wei
    Yan, Junjie
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 5016 - 5025
  • [26] Adaptive random forests for evolving data stream classification
    Gomes, Heitor M.
    Bifet, Albert
    Read, Jesse
    Barddal, Jean Paul
    Enembreck, Fabricio
    Pfahringer, Bernhard
    Holmes, Geoff
    Abdessalem, Talel
    MACHINE LEARNING, 2017, 106 (9-10) : 1469 - 1495
  • [27] Deep Learning for Imbalanced Multimedia Data Classification
    Yan, Yilin
    Chen, Min
    Shyu, Mei-Ling
    Chen, Shu-Ching
    2015 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2015, : 483 - 488
  • [28] Streaming Random Patches for Evolving Data Stream Classification
    Gomes, Heitor Murilo
    Read, Jesse
    Bifet, Albert
    2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019, : 240 - 249
  • [29] Adaptive regularized ensemble for evolving data stream classification
    Paim, Aldo M.
    Enembreck, Fabricio
    PATTERN RECOGNITION LETTERS, 2024, 180 : 55 - 61
  • [30] An Improved Ensemble Learning for Imbalanced Data Classification
    Yuan, Zhengwu
    Zhao, Pu
    PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019, : 408 - 411