Analysis of imbalanced data using cost-sensitive learning

Cited by: 0
Authors
Kim, Sojin [1]
Song, Jongwoo [1]
Affiliations
[1] Ewha Womans Univ, Dept Stat, Seoul, South Korea
Keywords
Imbalanced classification; cost-sensitive learning; classification performance; hybrid classification; SMOTE;
DOI
10.1080/03610926.2025.2472792
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Classification algorithms typically strive to maximize accuracy. However, when the data are significantly imbalanced, accuracy may not be the most suitable metric. We believe that the most effective approach for handling imbalanced cases is to minimize the total cost of misclassification. Unfortunately, precise misclassification costs are often unavailable in real-world scenarios. To address this problem, we offer a simple and efficient search algorithm for cost-sensitive learning. We also introduce a new performance metric, imbalanced data classification performance (IDCP), which combines the F-score and the area under the curve (AUC). By utilizing the imbalance ratio (IR) as a crucial factor, we use IDCP to determine optimal weights in cost-sensitive learning. Through extensive experiments, we show that our method can find the optimal decision boundary in imbalanced datasets. Our code is available at https://github.com/sssojin/Imbalanced_Classification
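The abstract does not state the IDCP formula or the details of the search procedure, so the following Python snippet is only a minimal sketch of the general idea: search over a misclassification weight bounded by the imbalance ratio and keep the weight that maximizes a score built from the F-score and AUC. Everything named here is an assumption for illustration: `placeholder_score` (a plain mean of F1 and AUC, which is not the paper's IDCP), the linear grid between 1 and IR, the use of the AUC of hard predictions, and the convention that the minority class is labeled 1. The authors' actual implementation is in their repository.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc_hard(y_true, y_pred):
    """AUC of hard 0/1 predictions, which reduces to (TPR + TNR) / 2."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return (tpr + tnr) / 2


def imbalance_ratio(y_true):
    """IR = majority count / minority count (requires both classes present)."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if min(pos, neg) == 0:
        raise ValueError("both classes must be present")
    return max(pos, neg) / min(pos, neg)


def placeholder_score(y_true, y_pred):
    # ASSUMPTION: a plain mean of F1 and AUC, standing in for IDCP.
    # The paper's actual IDCP definition is not given in the abstract.
    return 0.5 * (f1_score(y_true, y_pred) + auc_hard(y_true, y_pred))


def search_weight(y_true, probs, n_grid=20):
    """Grid-search the false-negative cost w in [1, IR].

    With false-positive cost 1 and false-negative cost w, the cost-minimizing
    rule predicts the positive class when P(y=1|x) >= 1 / (1 + w).
    Returns (best_w, best_score); ties keep the smallest weight.
    """
    ir = imbalance_ratio(y_true)
    best_w, best_s = 1.0, -1.0
    for i in range(n_grid + 1):
        w = 1.0 + (ir - 1.0) * i / n_grid
        threshold = 1.0 / (1.0 + w)
        y_pred = [1 if p >= threshold else 0 for p in probs]
        s = placeholder_score(y_true, y_pred)
        if s > best_s:
            best_w, best_s = w, s
    return best_w, best_s
```

Here `probs` stands for estimated minority-class probabilities from any fitted classifier. Capping the search at IR reflects the abstract's remark that the imbalance ratio is a crucial factor; how the paper actually uses IR may differ.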
Pages: 15
Related papers
50 in total
  • [21] COST-SENSITIVE SPFCNN MINER FOR CLASSIFICATION OF IMBALANCED DATA
    Zhao, Linchang
    Shang, Zhaowei
    Zhao, Ling
    Wei, Yu
    Tang, Yuan Yan
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON WAVELET ANALYSIS AND PATTERN RECOGNITION (ICWAPR), 2019, : 51 - 57
  • [23] Cost-Sensitive Learning of Fuzzy Rules for Imbalanced Classification Problems Using FURIA
    Palacios, Ana
    Trawinski, Krzysztof
    Cordon, Oscar
    Sanchez, Luciano
    INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2014, 22 (05) : 643 - 675
  • [24] Cost-sensitive learning strategies for high-dimensional and imbalanced data: a comparative study
    Pes B.
    Lai G.
    Pes, Barbara (pes@unica.it), 1600, PeerJ Inc. (07):
  • [25] Cost-sensitive Fuzzy Multiple Kernel Learning for imbalanced problem
    Wang, Zhe
    Wang, Bolu
    Cheng, Yang
    Li, Dongdong
    Zhang, Jing
    NEUROCOMPUTING, 2019, 366 : 178 - 193
  • [26] Cost-sensitive learning strategies for high-dimensional and imbalanced data: a comparative study
    Pes, Barbara
    Lai, Giuseppina
    PEERJ COMPUTER SCIENCE, 2021, 7
  • [27] Bayesian Optimization Cost-Sensitive XGBoost Learning Algorithm for Imbalanced Data in Semiconductor Industry
    Shamsudin, Haziqah
    Yusof, Umi Kalsom
    Kashif, Fizza
    Isa, Iza Sazanita
    JORDAN JOURNAL OF ELECTRICAL ENGINEERING, 2023, 9 (04) : 552 - 565
  • [28] Cost-sensitive continuous ensemble kernel learning for imbalanced data streams with concept drift
    Chen, Yingying
    Yang, Xiaowei
    Dai, Hong-Liang
    KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [29] A Statistical Approach to Cost-Sensitive AdaBoost for Imbalanced Data Classification
    Bei, Honghan
    Wang, Yajie
    Ren, Zhaonuo
    Jiang, Shuo
    Li, Keran
    Wang, Wenyang
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2021, 2021
  • [30] Cost-Sensitive Variational Autoencoding Classifier for Imbalanced Data Classification
    Liu, Fen
    Qian, Quan
    ALGORITHMS, 2022, 15 (05)