Boosting the oversampling methods based on differential evolution strategies for imbalanced learning

Cited by: 17
Authors
Korkmaz, Sedat [1 ]
Sahman, Mehmet Akif [2 ]
Cinar, Ahmet Cevahir [3 ]
Kaya, Ersin [1 ]
Affiliations
[1] Konya Tech Univ, Fac Engn & Nat Sci, Dept Comp Engn, Konya, Turkey
[2] Selcuk Univ, Fac Technol, Dept Elect & Elect Engn, Konya, Turkey
[3] Selcuk Univ, Fac Technol, Dept Comp Engn, Konya, Turkey
Keywords
Imbalanced datasets; Differential evolution; Oversampling; Imbalanced learning; Class imbalance; Differential evolution strategies; PREPROCESSING METHOD; GLOBAL OPTIMIZATION; SOFTWARE TOOL; SMOTE; CLASSIFICATION; ALGORITHMS; KEEL;
DOI
10.1016/j.asoc.2021.107787
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The class imbalance problem is a challenging problem in the data mining area. To overcome the low classification performance associated with imbalanced datasets, sampling strategies are used to balance them. Oversampling is a technique that increases the number of minority class samples in various proportions. In this work, 16 different differential evolution (DE) strategies are used to oversample imbalanced datasets for better classification. The main aim of this work is to determine the best strategy in terms of the Area Under the receiver operating characteristic (ROC) Curve (AUC) and Geometric Mean (G-Mean) metrics. 44 imbalanced datasets are used in the experiments, with Support Vector Machines (SVM), k-Nearest Neighbor (kNN), and Decision Tree (DT) as the classifiers. The best results are produced by the 6th Debohid Strategy (DSt6), the 1st Debohid Strategy (DSt1), and the 3rd Debohid Strategy (DSt3) with the kNN, DT, and SVM classifiers, respectively. The obtained results outperform 9 state-of-the-art oversampling methods in terms of the AUC and G-Mean metrics. (C) 2021 Elsevier B.V. All rights reserved.
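The core idea described in the abstract, generating synthetic minority samples with a DE mutation operator, can be sketched as follows. This is a minimal illustration using the classic DE/rand/1 mutation, v = x_r1 + F * (x_r2 - x_r3), applied within the minority class; the function name, the scale factor default, and the restriction to DE/rand/1 are assumptions for illustration, not the exact Debohid procedure or any of its 16 strategy variants.

```python
import numpy as np

def de_rand1_oversample(minority, n_new, f=0.8, seed=0):
    """Generate n_new synthetic minority samples via DE/rand/1 mutation.

    Each synthetic sample is v = x_r1 + f * (x_r2 - x_r3), where
    x_r1, x_r2, x_r3 are three distinct samples drawn from the
    minority class, so new points stay near the minority region.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n, d = minority.shape
    if n < 3:
        raise ValueError("DE/rand/1 needs at least 3 minority samples")
    synthetic = np.empty((n_new, d))
    for i in range(n_new):
        r1, r2, r3 = rng.choice(n, size=3, replace=False)
        synthetic[i] = minority[r1] + f * (minority[r2] - minority[r3])
    return synthetic
```

In a full pipeline, the synthetic rows would be appended to the training set (with the minority label) until the desired class ratio is reached, and the balanced set would then be passed to kNN, DT, or SVM as in the experiments above.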
Pages: 19
Related Papers
50 records total
  • [21] Fuzzy rule-based oversampling technique for imbalanced and incomplete data learning
    Liu, Gencheng
    Yang, Youlong
    Li, Benchong
    KNOWLEDGE-BASED SYSTEMS, 2018, 158 : 154 - 174
  • [22] Optimal Weighted Extreme Learning Machine for Imbalanced Learning with Differential Evolution
    Ri, JongHyok
    Liu, Liang
    Liu, Yong
    Wu, Huifeng
    Huang, Wenliang
    Kim, Hun
    IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, 2018, 13 (03) : 32 - 47
  • [23] CL-SR: Boosting Imbalanced Image Classification with Contrastive Learning and Synthetic Minority Oversampling Technique Based on Rough Set Theory Integration
    Gao, Xiaoling
    Jamil, Nursuriati
    Ramli, Muhammad Izzad
    APPLIED SCIENCES-BASEL, 2024, 14 (23):
  • [24] Ensemble weighted extreme learning machine for imbalanced data classification based on differential evolution
    Zhang, Yong
    Liu, Bo
    Cai, Jing
    Zhang, Suhua
    NEURAL COMPUTING & APPLICATIONS, 2017, 28 : S259 - S267
  • [25] Ensemble weighted extreme learning machine for imbalanced data classification based on differential evolution
    Zhang, Yong
    Liu, Bo
    Cai, Jing
    Zhang, Suhua
    NEURAL COMPUTING & APPLICATIONS, 2017, 28 : 259 - 267
  • [26] Boosting weighted ELM for imbalanced learning
    Li, Kuan
    Kong, Xiangfei
    Lu, Zhi
    Liu, Wenyin
    Yin, Jianping
    NEUROCOMPUTING, 2014, 128 : 15 - 21
  • [27] Oversampling Methods for Classification of Imbalanced Breast Cancer Malignancy Data
    Krawczyk, Bartosz
    Jelen, Lukasz
    Krzyzak, Adam
    Fevens, Thomas
    COMPUTER VISION AND GRAPHICS, 2012, 7594 : 483 - 490
  • [28] A Boundary-Information-Based Oversampling Approach to Improve Learning Performance for Imbalanced Datasets
    Li, Der-Chiang
    Shi, Qi-Shi
    Lin, Yao-San
    Lin, Liang-Sian
    ENTROPY, 2022, 24 (03)
  • [29] A novel oversampling technique for class-imbalanced learning based on SMOTE and natural neighbors
    Li, Junnan
    Zhu, Qingsheng
    Wu, Quanwang
    Fan, Zhu
    INFORMATION SCIENCES, 2021, 565 : 438 - 455
  • [30] A review of boosting methods for imbalanced data classification
    Li, Qiujie
    Mao, Yaobin
    PATTERN ANALYSIS AND APPLICATIONS, 2014, 17 (04) : 679 - 693