An ensemble learning method with GAN-based sampling and consistency check for anomaly detection of imbalanced data streams with concept drift

Cited by: 2
Authors
Liu, Yansong [1, 2]
Wang, Shuang [3 ]
Sui, He [4 ]
Zhu, Li [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian, Shaanxi, Peoples R China
[2] Shandong Management Univ, Sch Intelligent Engn, Jinan, Shandong, Peoples R China
[3] Civil Aviat Univ China, Informat Secur Evaluat Ctr Civil Aviat, Tianjin, Peoples R China
[4] Civil Aviat Univ China, Coll Aeronaut Engn, Tianjin, Peoples R China
Source
PLOS ONE | 2024 / Vol. 19 / Issue 01
Keywords
DYNAMIC WEIGHTED MAJORITY;
DOI
10.1371/journal.pone.0292140
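Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract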
Many real-world data streams are both class-imbalanced and subject to concept drift, and handling the two together is one of the most critical tasks in anomaly detection. Learning from nonstationary data streams for anomaly detection has been well studied in recent years. However, most of this research assumes that the class distribution of the data stream is relatively balanced, and only a few approaches tackle the joint issue of imbalance and concept drift. To overcome this joint issue, we propose an ensemble learning method with generative adversarial network-based sampling and consistency check (EGSCC). First, we design a comprehensive anomaly detection framework that includes a generative adversarial network (GAN) oversampling module, an ensemble classifier, and a consistency check module. Next, we introduce double encoders into the GAN to better capture the distribution characteristics of imbalanced data for oversampling. Then, we apply stacking ensemble learning to deal with concept drift: four base classifiers (SVM, KNN, DT, and RF) are used in the first layer, and LR is used as the meta-classifier in the second layer. In addition, we run a consistency check between the incremental instance and the check set to determine, by statistical learning rather than a threshold-based method, whether the instance is anomalous; the validation set is dynamically updated according to the consistency check result. Finally, three artificial data sets obtained from the Massive Online Analysis platform and two real data sets are used to verify the performance of the proposed method from four aspects: detection performance, parameter sensitivity, algorithm cost, and anti-noise ability. Experimental results show that the proposed method has significant advantages in anomaly detection of imbalanced data streams with concept drift.
Pages: 24
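The two-layer stacking arrangement named in the abstract (SVM, KNN, DT, and RF as base classifiers, LR as meta-classifier) can be illustrated with a minimal sketch. This is not the authors' EGSCC implementation: it assumes scikit-learn's `StackingClassifier`, default hyperparameters, and a synthetic imbalanced data set from `make_classification`. The GAN oversampling and consistency check modules are not shown, and the helper name `build_stacking_ensemble` is hypothetical.

```python
# Sketch only: a generic two-layer stacking ensemble, not the paper's EGSCC code.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_stacking_ensemble(random_state=0):
    """Hypothetical helper: first layer SVM/KNN/DT/RF, second layer LR."""
    base_classifiers = [
        ("svm", SVC(random_state=random_state)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("dt", DecisionTreeClassifier(random_state=random_state)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
    ]
    # The meta-classifier is trained on cross-validated outputs of the base layer.
    return StackingClassifier(
        estimators=base_classifiers,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Assumed toy data: roughly 90% normal (class 0) and 10% anomalous (class 1).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = build_stacking_ensemble().fit(X_train, y_train)
predictions = model.predict(X_test)
```

In a streaming setting the ensemble would be refit or updated as new chunks arrive; the sketch trains once on a static split only to show how the two layers are wired together.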