Analysis of imbalanced data using cost-sensitive learning

Authors
Kim, Sojin [1]
Song, Jongwoo [1]
Affiliations
[1] Ewha Womans Univ, Dept Stat, Seoul, South Korea
Keywords
Imbalanced classification; cost-sensitive learning; classification performance; hybrid classification; SMOTE
DOI
10.1080/03610926.2025.2472792
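Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract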
Classification algorithms typically strive to maximize accuracy. However, when dealing with significantly imbalanced data, accuracy may not be the most suitable metric. We believe that the most effective approach for handling imbalanced cases is to minimize the total cost. Unfortunately, precise misclassification costs are often unavailable in real-world scenarios. To address this problem, we offer a simple and efficient search algorithm for cost-sensitive learning. We also introduce a new performance metric, imbalanced data classification performance (IDCP), which combines the F-score and the area under the curve (AUC). By utilizing the imbalance ratio (IR) as a crucial factor, we use IDCP to determine optimal weights in cost-sensitive learning. Through extensive experiments, we show that our method can find the optimal decision boundary in imbalanced datasets. Our code is available at https://github.com/sssojin/Imbalanced_Classification
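
The idea of searching for a misclassification weight by scoring each candidate with a metric that mixes the F-score and AUC can be illustrated with a minimal sketch. This is not the authors' algorithm, and the exact IDCP formula is not given in this record: `placeholder_score` below is a hypothetical stand-in (a plain average of F1 and the AUC of the hard predictions), and the helper names, the candidate weight grid, and the toy data are all assumptions. The sketch relies on the standard result that, with a false-negative cost of `w` and a false-positive cost of 1, the cost-minimizing rule predicts the minority class when its estimated probability exceeds `1 / (1 + w)`.

```python
from typing import Sequence, Tuple


def imbalance_ratio(y: Sequence[int]) -> float:
    """Majority-class count divided by minority-class count (labels are 0/1)."""
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    return max(n_pos, n_neg) / min(n_pos, n_neg)


def f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for the positive class: 2TP / (2TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def placeholder_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Hypothetical stand-in for IDCP (NOT the paper's formula): mean of F1 and AUC."""
    return 0.5 * (f1(y_true, y_pred) + auc(y_true, y_pred))


def search_weight(
    y_true: Sequence[int], proba: Sequence[float], weights: Sequence[float]
) -> Tuple[float, float]:
    """Return the (weight, score) pair with the highest placeholder score."""
    best_w, best_score = weights[0], -1.0
    for w in weights:
        threshold = 1.0 / (1.0 + w)  # cost-minimizing cutoff for FN cost w, FP cost 1
        y_pred = [1 if p > threshold else 0 for p in proba]
        score = placeholder_score(y_true, y_pred)
        if score > best_score:  # strict comparison keeps the smallest best weight
            best_w, best_score = w, score
    return best_w, best_score


# Toy data (assumed): 8 majority cases, 2 minority cases, with estimated probabilities.
y = [0] * 8 + [1] * 2
p = [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.3, 0.45, 0.35, 0.6]
print(imbalance_ratio(y), search_weight(y, p, [1, 2, 3, 4]))
```

Capping the candidate grid at the imbalance ratio, as in the toy call above, is only one plausible way of letting IR guide the search; the abstract does not specify how IR enters the procedure.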
Pages: 15