Exploratory parallel hybrid sampling framework for imbalanced data classification

Cited by: 0
Authors
Zheng, Ming [3, 4]
Zhao, Zhuo [3]
Wang, Fei [3]
Hu, Xiaowen [3]
Xu, Sheng [3, 4]
Li, Wanggen [3]
Li, Tong [1, 2]
Affiliations
[1] Yunnan Agr Univ, Big Data Sch, Kunming 650201, Peoples R China
[2] Yunnan Agr Univ, Key Lab Crop Prod & Smart Agr Yunnan Prov, Kunming 650201, Peoples R China
[3] Anhui Normal Univ, Sch Comp & Informat, Wuhu 241002, Peoples R China
[4] Anhui Prov Key Lab Ind Intelligence Data Secur, Wuhu 241002, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced data; Oversampling; Undersampling; Parallel hybrid sampling framework; Serial hybrid sampling frameworks; ENSEMBLE; SMOTE
DOI
10.1016/j.engappai.2024.109428
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Engineering applications often face the challenge of imbalanced data. Hybrid sampling is an effective way to handle imbalanced data classification: it avoids both the overfitting that arises when oversampling is used alone and the mistaken deletion of useful majority samples that arises when undersampling is used alone. However, most current hybrid sampling approaches are implemented serially, so the oversampling and undersampling steps, run one after the other, interfere with and influence each other. This study proposes a parallel hybrid sampling framework based on the idea of parallel engineering and theoretically analyzes its superiority. The experimental results show that, when applied to five classification algorithms and evaluated with three performance metrics, the proposed framework outperforms the two mainstream hybrid sampling frameworks. Moreover, the proposed framework effectively reduces the time consumed by the hybrid sampling process.
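The contrast between serial and parallel hybrid sampling can be illustrated with a minimal sketch. This is not the authors' framework, whose details the abstract does not give. It assumes binary labels, plain random oversampling and random undersampling as stand-ins for whatever samplers the paper uses, and a hypothetical target size equal to the midpoint of the two class counts. All function and parameter names are illustrative. The only point it shows is that both branches read the original data, so neither operates on the other's output.

```python
import numpy as np


def random_oversample(X_min, n_target, rng):
    """Duplicate minority rows (sampled with replacement) up to n_target rows."""
    extra = rng.integers(0, len(X_min), size=n_target - len(X_min))
    return np.vstack([X_min, X_min[extra]])


def random_undersample(X_maj, n_target, rng):
    """Keep n_target majority rows, sampled without replacement."""
    keep = rng.choice(len(X_maj), size=n_target, replace=False)
    return X_maj[keep]


def parallel_hybrid_sample(X, y, minority_label=1, majority_label=0, seed=0):
    """Illustrative parallel hybrid sampling (assumption, not the paper's method).

    Both branches take their input from the ORIGINAL data and are merged
    afterwards. A serial scheme would instead feed the output of one
    sampler into the other.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    X_maj = X[y == majority_label]

    # Hypothetical balancing target: midpoint of the two class sizes.
    n_target = (len(X_min) + len(X_maj)) // 2

    # The two branches are independent, so they could run concurrently.
    X_min_new = random_oversample(X_min, n_target, rng)
    X_maj_new = random_undersample(X_maj, n_target, rng)

    X_out = np.vstack([X_min_new, X_maj_new])
    y_out = np.concatenate([
        np.full(n_target, minority_label),
        np.full(n_target, majority_label),
    ])
    return X_out, y_out


if __name__ == "__main__":
    X = np.arange(40, dtype=float).reshape(20, 2)
    y = np.array([0] * 16 + [1] * 4)
    X_bal, y_bal = parallel_hybrid_sample(X, y)
    print(X_bal.shape, np.bincount(y_bal))
```

Because the branches share no intermediate state, they could be dispatched to separate workers, which is one plausible reading of the reduced time consumption reported in the abstract.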
Pages: 13