Exploratory parallel hybrid sampling framework for imbalanced data classification

Times cited: 0
Authors
Zheng, Ming [3, 4]
Zhao, Zhuo [3]
Wang, Fei [3]
Hu, Xiaowen [3]
Xu, Sheng [3, 4]
Li, Wanggen [3]
Li, Tong [1, 2]
Affiliations
[1] Yunnan Agr Univ, Big Data Sch, Kunming 650201, Peoples R China
[2] Yunnan Agr Univ, Key Lab Crop Prod & Smart Agr Yunnan Prov, Kunming 650201, Peoples R China
[3] Anhui Normal Univ, Sch Comp & Informat, Wuhu 241002, Peoples R China
[4] Anhui Prov Key Lab Ind Intelligence Data Secur, Wuhu 241002, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced data; Oversampling; Undersampling; Parallel hybrid sampling framework; Serial hybrid sampling frameworks; ENSEMBLE; SMOTE;
DOI
10.1016/j.engappai.2024.109428
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Engineering applications often face the challenge of imbalanced data. Hybrid sampling is an effective way to handle imbalanced data classification because it avoids the overfitting and the mistaken deletion of useful majority samples that arise when oversampling or undersampling is used alone. However, most current hybrid sampling approaches are implemented serially, so that the oversampling and undersampling steps interfere with and influence each other. This study proposes a parallel hybrid sampling framework based on the idea of parallel engineering and theoretically analyzes its superiority. Experimental results show that, when applied to five classification algorithms and evaluated with three performance metrics, the proposed framework outperforms the two mainstream hybrid sampling frameworks. Moreover, the proposed framework effectively reduces the time consumption of the hybrid sampling process.
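The abstract does not spell out the framework's internals, so the following is only a minimal sketch of the general idea of running the two samplers side by side, not the authors' method. It assumes plain random oversampling and random undersampling, a balancing target of half the total sample count, and hypothetical function names (`oversample_minority`, `undersample_majority`, `parallel_hybrid_sample`). The point it illustrates is that each sampler reads only the original data, so neither sees the other's output.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def oversample_minority(minority, target_size, seed):
    """Random oversampling: duplicate minority samples up to target_size."""
    rng = random.Random(seed)
    missing = max(0, target_size - len(minority))
    extra = [rng.choice(minority) for _ in range(missing)]
    return list(minority) + extra


def undersample_majority(majority, target_size, seed):
    """Random undersampling: keep target_size majority samples, no repeats."""
    rng = random.Random(seed)
    return rng.sample(list(majority), min(target_size, len(majority)))


def parallel_hybrid_sample(minority, majority, seed=0):
    """Run both samplers on the ORIGINAL data concurrently, then return both.

    Assumption: each class is resized to half of the total sample count.
    """
    target = (len(minority) + len(majority)) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        over = pool.submit(oversample_minority, minority, target, seed)
        under = pool.submit(undersample_majority, majority, target, seed + 1)
        return over.result(), under.result()


# Toy data: 10 minority samples and 90 majority samples.
minority = [("min", i) for i in range(10)]
majority = [("maj", i) for i in range(90)]
new_minority, new_majority = parallel_hybrid_sample(minority, majority)
print(len(new_minority), len(new_majority))  # prints "50 50"
```

In a serial pipeline, by contrast, the second sampler would receive the first sampler's output as its input. The threads here only illustrate the independence of the two steps; in CPython they would not by themselves yield a speedup for CPU-bound sampling.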
Pages: 13
Related papers
50 in total
  • [1] A Hybrid Sampling SVM Approach to Imbalanced Data Classification
    Wang, Qiang
    ABSTRACT AND APPLIED ANALYSIS, 2014,
  • [2] CLUS: A New Hybrid Sampling Classification for Imbalanced Data
    Prachuabsupakij, Wanthanee
    PROCEEDINGS OF THE 2015 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2015, : 281 - 286
  • [3] Parallel selective sampling method for imbalanced and large data classification
    D'Addabbo, Annarita
    Maglietta, Rosalia
    PATTERN RECOGNITION LETTERS, 2015, 62 : 61 - 67
  • [4] Hybrid sampling for imbalanced data
    Seiffert, Chris
    Khoshgoftaar, Taghi M.
    Van Hulse, Jason
    PROCEEDINGS OF THE 2008 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2008, : 202 - 207
  • [5] Hybrid sampling for imbalanced data
    Seiffert, Chris
    Khoshgoftaar, Taghi M.
    Van Hulse, Jason
    INTEGRATED COMPUTER-AIDED ENGINEERING, 2009, 16 (03) : 193 - 210
  • [6] HSDLM: A Hybrid Sampling With Deep Learning Method for Imbalanced Data Classification
    Hasib, Khan Md
    Towhid, Nurul Akter
    Islam, Md Rafiqul
    INTERNATIONAL JOURNAL OF CLOUD APPLICATIONS AND COMPUTING, 2021, 11 (04) : 1 - 13
  • [7] Framework for imbalanced data classification
    Blaszczyk, Mikolaj
    Jedrzejowicz, Joanna
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES 2021), 2021, 192 : 3477 - 3486
  • [8] A cluster-based hybrid sampling approach for imbalanced data classification
    Feng, Shou
    Zhao, Chunhui
    Fu, Ping
    REVIEW OF SCIENTIFIC INSTRUMENTS, 2020, 91 (05):
  • [9] A Hybrid Learning Framework for Imbalanced Classification
    Jiang, Eric P.
    INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2022, 18 (01)
  • [10] A Hybrid Sampling Method for Imbalanced Data
    Gazzah, Sami
    Hechkel, Amina
    Ben Amara, Najoua Essoukri
    2015 IEEE 12TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2015,