Dynamic weighted selective ensemble learning algorithm for imbalanced data streams

Cited by: 7
Authors
Yan, Zhang [1 ,2 ]
Du Hongle [1 ,2 ]
Gang, Ke [3 ]
Lin, Zhang [1 ,2 ]
Chen, Yeh-Cheng [4 ]
Affiliations
[1] Shangluo Univ, Sch Math & Comp Applicat, Shangluo City, Shaanxi, Peoples R China
[2] Shangluo Publ Big Data Res Ctr, Shangluo City, Shaanxi, Peoples R China
[3] Dongguan Polytech, Dept Comp Engn, Dongguan, Guangdong, Peoples R China
[4] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Source
JOURNAL OF SUPERCOMPUTING | 2022 / Vol. 78 / Issue 04
Keywords
Concept drift; Imbalanced data stream; Data stream mining; Ensemble learning;
DOI
10.1007/s11227-021-04084-w
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Data stream mining is an active topic in data mining. Most existing algorithms assume that a data stream with concept drift is class-balanced. In real-world applications, however, data streams with concept drift are often imbalanced, which makes the learning problem more complex. In online learning, oversampling selects a small number of samples from the previous data block according to some strategy and adds them to the current data block to amplify the current minority class. With this approach, the number of stored samples, the oversampling method, and the weight calculation of the base classifiers all affect the classification performance of the ensemble classifier. This paper proposes a dynamic weighted selective ensemble (DWSE) learning algorithm for imbalanced data streams with concept drift. On the one hand, resampling the minority samples of the previous data block amplifies the minority class of the current data block and lets information from the previous block be absorbed when building a classifier, which reduces the impact of concept drift. The paper defines how the information content of each sample is calculated and gives the resampling and updating methods for the minority samples. On the other hand, concept drift degrades the performance of a base classifier, and a decay factor is usually used to describe this degradation. A static decay factor, however, cannot describe the degradation accurately under concept drift. DWSE therefore defines a dynamic decay factor for each base classifier and uses the degree of attenuation to select which sub-classifiers to eliminate, which lets the algorithm handle concept drift better. In comparisons with other algorithms, the results show that DWSE achieves better classification performance on both majority-class and minority-class samples.
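The two mechanisms described in the abstract can be illustrated with a small sketch. This is not the DWSE algorithm itself: the abstract does not give the information-content formula or the dynamic decay factor, so the code below substitutes uniform random selection of past minority samples and a decay equal to each member's accuracy on the newest block. All names (`carry_over_minority`, `Member`, `update_weights`, `vote`) and the convention that label 1 marks the minority class are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (features, label); label 1 = minority (assumed)


def carry_over_minority(prev_block: List[Sample], curr_block: List[Sample],
                        k: int, rng: random.Random) -> List[Sample]:
    """Add up to k minority samples from the previous block to the current one.

    Assumption: uniform random choice stands in for the paper's
    information-content-based selection, which the abstract does not specify.
    """
    prev_minority = [s for s in prev_block if s[1] == 1]
    chosen = rng.sample(prev_minority, min(k, len(prev_minority)))
    return curr_block + chosen


@dataclass
class Member:
    predict: Callable[[Sequence[float]], int]  # base classifier's prediction function
    weight: float = 1.0


def update_weights(ensemble: List[Member], block: List[Sample],
                   max_size: int) -> List[Member]:
    """Decay each member's weight by its own result on the newest block, then prune.

    Assumption: decay = accuracy on the newest block (a hypothetical rule,
    chosen only because it varies per member and per block).
    """
    if block:
        for m in ensemble:
            correct = sum(1 for x, y in block if m.predict(x) == y)
            m.weight *= correct / len(block)
    # Keep only the max_size members with the largest weights.
    return sorted(ensemble, key=lambda m: m.weight, reverse=True)[:max_size]


def vote(ensemble: List[Member], x: Sequence[float]) -> int:
    """Weighted vote: +weight for predicting class 1, -weight otherwise."""
    score = sum(m.weight * (1 if m.predict(x) == 1 else -1) for m in ensemble)
    return 1 if score > 0 else 0


# Usage on toy one-dimensional data.
rng = random.Random(0)
prev = [([0.9], 1), ([0.8], 1), ([0.1], 0)]
curr = [([0.2], 0), ([0.3], 0), ([0.7], 1)]
augmented = carry_over_minority(prev, curr, k=2, rng=rng)

threshold_member = Member(lambda x: int(x[0] > 0.5))
always_majority = Member(lambda x: 0)
ensemble = update_weights([threshold_member, always_majority], augmented, max_size=1)
```

In this toy run the always-majority member is wrong on every minority sample, so its weight decays faster and it is the one pruned. Plain accuracy is a weak criterion on imbalanced data; a class-sensitive measure would be a natural replacement in the same slot.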
Pages: 5394 - 5419
Number of pages: 26
Related papers
50 in total
  • [41] Multicriteria Classifier Ensemble Learning for Imbalanced Data
    Wegier, Weronika
    Koziarski, Michal
    Wozniak, Michal
    IEEE Access, 2022, 10 : 16807 - 16818
  • [42] An Improved Ensemble Learning for Imbalanced Data Classification
    Yuan, Zhengwu
    Zhao, Pu
    PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019, : 408 - 411
  • [43] Multicriteria Classifier Ensemble Learning for Imbalanced Data
    Wegier, Weronika
    Koziarski, Michal
    Wozniak, Michal
    IEEE ACCESS, 2022, 10 : 16807 - 16818
  • [44] A Selective Ensemble Learning Framework for ECG-Based Heartbeat Classification with Imbalanced Data
    Ge, Hongwei
    Sun, Keyi
    Sun, Liang
    Zhao, Mingde
    Wu, Chunguo
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 2753 - 2755
  • [45] Hellinger Distance Weighted Ensemble for imbalanced data stream classification
    Grzyb, Joanna
    Klikowski, Jakub
    Wozniak, Michal
    JOURNAL OF COMPUTATIONAL SCIENCE, 2021, 51
  • [46] Incremental Weighted Ensemble for Data Streams With Concept Drift
    Jiao B.
    Guo Y.
    Yang C.
    Pu J.
    Zheng Z.
    Gong D.
    IEEE Transactions on Artificial Intelligence, 2024, 5 (01) : 92 - 103
  • [47] AN ADAPTIVE SELECTIVE ENSEMBLE FOR DATA STREAMS CLASSIFICATION
    Grossi, Valerio
    Turini, Franco
    ICAART 2011: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1, 2011, : 136 - 145
  • [48] ROSE: robust online self-adjusting ensemble for continual learning on imbalanced drifting data streams
    Cano, Alberto
    Krawczyk, Bartosz
    MACHINE LEARNING, 2022, 111 (07) : 2561 - 2599
  • [49] The Gradual Resampling Ensemble for mining imbalanced data streams with concept drift
    Ren, Siqi
    Liao, Bo
    Zhu, Wen
    Li, Zeng
    Liu, Wei
    Li, Keqin
    NEUROCOMPUTING, 2018, 286 : 150 - 166
  • [50] Classifier Selection for Highly Imbalanced Data Streams with Minority Driven Ensemble
    Zyblewski, Pawel
    Ksieniewicz, Pawel
    Wozniak, Michal
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT I, 2019, 11508 : 626 - 635