A survey of multi-class imbalanced data classification methods

被引:4
|
作者
Han, Meng [1 ]
Li, Ang [1 ]
Gao, Zhihui [1 ]
Mu, Dongliang [1 ]
Liu, Shujuan [1 ]
机构
[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan, Ningxia, Peoples R China
关键词
Classification; multi-class imbalance data; data preprocessing method; algorithm-level classification method; EXTREME LEARNING-MACHINE; SELECTION; ALGORITHM; CNN;
D O I
10.3233/JIFS-221902
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In reality, the data generated in many fields are often imbalanced, such as fraud detection, network intrusion detection and disease diagnosis. The class with fewer instances in the data is called the minority class, and the minority class in some applications contains the significant information. So far, many classification methods and strategies for binary imbalanced data have been proposed, but there are still many problems and challenges in multi-class imbalanced data that need to be solved urgently. The classification methods for multi-class imbalanced data are analyzed and summarized in terms of data preprocessing methods and algorithm-level classification methods, and the performance of the algorithms using the same dataset is compared separately. In the data preprocessing methods, the methods of oversampling, under-sampling, hybrid sampling and feature selection are mainly introduced. Algorithm-level classification methods are comprehensively introduced in four aspects: ensemble learning, neural network, support vector machine and multi-class decomposition technique. At the same time, all data preprocessing methods and algorithm-level classification methods are analyzed in detail in terms of the techniques used, comparison algorithms, pros and cons, respectively. Moreover, the evaluation metrics commonly used for multi-class imbalanced data classification methods are described comprehensively. Finally, the future directions of multi-class imbalanced data classification are given.
引用
收藏
页码:2471 / 2501
页数:31
相关论文
共 50 条
  • [1] Boosting methods for multi-class imbalanced data classification: an experimental review
    Jafar Tanha
    Yousef Abdi
    Negin Samadi
    Nazila Razzaghi
    Mohammad Asadpour
    Journal of Big Data, 7
  • [2] Boosting methods for multi-class imbalanced data classification: an experimental review
    Tanha, Jafar
    Abdi, Yousef
    Samadi, Negin
    Razzaghi, Nazila
    Asadpour, Mohammad
    JOURNAL OF BIG DATA, 2020, 7 (01)
  • [3] Survey on Highly Imbalanced Multi-class Data
    Hamid, Hakim Abdul
    Yusoff, Marina
    Mohamed, Azlinah
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (06) : 211 - 229
  • [4] Multi-class imbalanced big data classification on Spark
    Sleeman, William C.
    Krawczyk, Bartosz
    KNOWLEDGE-BASED SYSTEMS, 2021, 212
  • [5] A Combination Method for Multi-Class Imbalanced Data Classification
    Li, Hu
    Zou, Peng
    Han, Weihong
    Xia, Rongze
    2013 10TH WEB INFORMATION SYSTEM AND APPLICATION CONFERENCE (WISA 2013), 2013, : 365 - 368
  • [6] Classification of Multi-class Imbalanced Data: Data Difficulty Factors and Selected Methods for Improving Classifiers
    Stefanowski, Jerzy
    ROUGH SETS (IJCRS 2021), 2021, 12872 : 57 - 72
  • [7] Selecting local ensembles for multi-class imbalanced data classification
    Krawczyk, Bartosz
    Cano, Alberto
    Wozniak, Michal
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [8] Undersampling with Support Vectors for Multi-Class Imbalanced Data Classification
    Krawczyk, Bartosz
    Bellinger, Colin
    Corizzo, Roberto
    Japkowicz, Nathalie
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [9] Multi-class Boosting for Imbalanced Data
    Fernandez-Baldera, Antonio
    Buenaposada, Jose M.
    Baumela, Luis
    PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2015), 2015, 9117 : 57 - 64
  • [10] Multi-class WHMBoost: An ensemble algorithm for multi-class imbalanced data
    Zhao, Jiakun
    Jin, Ju
    Zhang, Yibo
    Zhang, Ruifeng
    Chen, Si
    INTELLIGENT DATA ANALYSIS, 2022, 26 (03) : 599 - 614