Boosting the oversampling methods based on differential evolution strategies for imbalanced learning

Cited by: 17
Authors
Korkmaz, Sedat [1]
Sahman, Mehmet Akif [2]
Cinar, Ahmet Cevahir [3]
Kaya, Ersin [1]
Affiliations
[1] Konya Tech Univ, Fac Engn & Nat Sci, Dept Comp Engn, Konya, Turkey
[2] Selcuk Univ, Fac Technol, Dept Elect & Elect Engn, Konya, Turkey
[3] Selcuk Univ, Fac Technol, Dept Comp Engn, Konya, Turkey
Keywords
Imbalanced datasets; Differential evolution; Oversampling; Imbalanced learning; Class imbalance; Differential evolution strategies; PREPROCESSING METHOD; GLOBAL OPTIMIZATION; SOFTWARE TOOL; SMOTE; CLASSIFICATION; ALGORITHMS; KEEL;
DOI
10.1016/j.asoc.2021.107787
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The class imbalance problem is a challenging problem in data mining. To overcome the low classification performance associated with imbalanced datasets, sampling strategies are used to balance the datasets. Oversampling is a technique that increases the number of minority class samples in various proportions. In this work, 16 different differential evolution (DE) strategies are used to oversample imbalanced datasets for better classification. The main aim of this work is to determine the best strategy in terms of the Area Under the receiver operating characteristic (ROC) Curve (AUC) and Geometric Mean (G-Mean) metrics. 44 imbalanced datasets are used in the experiments. Support Vector Machines (SVM), k-Nearest Neighbor (kNN), and Decision Tree (DT) are used as classifiers in the experiments. The best results are produced by the 6th Debohid Strategy (DSt6), the 1st Debohid Strategy (DSt1), and the 3rd Debohid Strategy (DSt3) with the kNN, DT, and SVM classifiers, respectively. The obtained results outperform 9 state-of-the-art oversampling methods in terms of the AUC and G-Mean metrics. (C) 2021 Elsevier B.V. All rights reserved.
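As a rough illustration of the two ideas named in the abstract, the sketch below generates synthetic minority samples with the standard DE/rand/1 mutation and computes the G-Mean of a binary prediction. It is not the authors' implementation: the function names, the scale factor `f=0.5`, and the choice of DE/rand/1 are assumptions made for this example, and the record does not say how the numbered strategies (DSt1 to DSt16) map to specific DE mutation schemes.

```python
import math
import random


def de_rand_1_oversample(minority, n_new, f=0.5, seed=0):
    """Create n_new synthetic points as x_r1 + f * (x_r2 - x_r3).

    Hypothetical helper: DE/rand/1 is a standard DE mutation, used here
    only as an assumed stand-in for the strategies in the abstract.
    """
    if len(minority) < 3:
        raise ValueError("DE/rand/1 needs at least three minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick three distinct minority samples as parents.
        r1, r2, r3 = rng.sample(minority, 3)
        synthetic.append([a + f * (b - c) for a, b, c in zip(r1, r2, r3)])
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(de_rand_1_oversample(minority, n_new=3))
    print(g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

G-Mean penalizes a classifier that ignores the minority class: if sensitivity is zero, the score is zero regardless of majority-class accuracy, which is why it is commonly reported alongside AUC for imbalanced data.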
Pages: 19