An ensemble learning method with GAN-based sampling and consistency check for anomaly detection of imbalanced data streams with concept drift

Cited by: 2
Authors
Liu, Yansong [1, 2]
Wang, Shuang [3 ]
Sui, He [4 ]
Zhu, Li [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian, Shaanxi, Peoples R China
[2] Shandong Management Univ, Sch Intelligent Engn, Jinan, Shandong, Peoples R China
[3] Civil Aviat Univ China, Informat Secur Evaluat Ctr Civil Aviat, Tianjin, Peoples R China
[4] Civil Aviat Univ China, Coll Aeronaut Engn, Tianjin, Peoples R China
Source
PLOS ONE | 2024 / Vol. 19 / Issue 01
Keywords
DYNAMIC WEIGHTED MAJORITY;
DOI
10.1371/journal.pone.0292140
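Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code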
07 ; 0710 ; 09 ;
Abstract
Many real-world data streams are both imbalanced and subject to concept drift, and handling the two together is one of the most critical tasks in anomaly detection. Learning from nonstationary data streams for anomaly detection has been well studied in recent years. However, most studies assume that the classes in the data stream are relatively balanced, and only a few approaches tackle the joint issue of imbalance and concept drift. To address this joint issue, we propose an ensemble learning method with generative adversarial network-based sampling and consistency check (EGSCC). First, we design a comprehensive anomaly detection framework that includes a generative adversarial network (GAN) oversampling module, an ensemble classifier, and a consistency check module. Next, we introduce double encoders into the GAN to better capture the distribution characteristics of imbalanced data for oversampling. Then, we apply stacking ensemble learning to deal with concept drift: four base classifiers (SVM, KNN, DT, and RF) are used in the first layer, and LR is used as the meta classifier in the second layer. In addition, we perform a consistency check between each incremental instance and the check set to determine, by statistical learning rather than a threshold-based method, whether the instance is anomalous; the validation set is dynamically updated according to the consistency check result. Finally, three artificial data sets obtained from the Massive Online Analysis platform and two real data sets are used to evaluate the proposed method from four aspects: detection performance, parameter sensitivity, algorithm cost, and anti-noise ability. Experimental results show that the proposed method has significant advantages in anomaly detection of imbalanced data streams with concept drift.
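The two-layer stacking arrangement named in the abstract (SVM, KNN, DT, and RF as base classifiers, LR as the meta classifier) can be sketched roughly as follows. This is a minimal illustration only, assuming scikit-learn and a synthetic imbalanced data set; the hyperparameters, the data, and the variable names are assumptions, and the GAN-based oversampling, the consistency check, and the streaming updates of EGSCC are not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data (roughly 10% minority class).
# This is a stand-in, not one of the data sets used in the paper.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# First layer: the four base classifiers named in the abstract.
# All hyperparameters are assumed defaults, not the authors' settings.
base_learners = [
    ("svm", SVC(random_state=0)),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Second layer: logistic regression trained on cross-validated
# outputs of the base classifiers.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)
predictions = stack.predict(X_test)
```

In a streaming setting the stack would have to be refit or updated as new chunks arrive; how EGSCC does this is described in the paper itself, not in this sketch.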
Pages: 24