Multi-class WHMBoost: An ensemble algorithm for multi-class imbalanced data

Cited by: 1
Authors
Zhao, Jiakun [1]
Jin, Ju [1]
Zhang, Yibo [1]
Zhang, Ruifeng [1]
Chen, Si [1]
Affiliations
[1] Xi'an Jiaotong University, School of Software Engineering, Xi'an, Shaanxi, People's Republic of China
Keywords
multi-class; imbalanced data; ensemble method; random balance based on average size; classification
DOI
10.3233/IDA-215874
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Imbalanced data is widespread in real-world applications, and ignoring the imbalance when training a machine learning model degrades the model's performance. Researchers have proposed many methods for handling imbalanced data, but most of them target two-class classification tasks; learning from multi-class imbalanced data sets remains an open problem. This paper puts forward an ensemble method for classifying multi-class imbalanced data sets, called multi-class WHMBoost, which extends the WHMBoost algorithm we proposed earlier. Instead of the data-processing algorithm used in WHMBoost, it uses random balance based on average size to balance the data distribution. The weak classifiers in the boosting algorithm are a support vector machine and a decision tree classifier, which participate in training with given weights so that their strengths complement each other. On 18 multi-class imbalanced data sets, we compared multi-class WHMBoost with state-of-the-art ensemble algorithms using MAUC, MG-mean and MMCC as evaluation criteria. The results show that it has clear advantages over these algorithms and can deal effectively with multi-class imbalanced data sets.
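One of the evaluation criteria named in the abstract, MG-mean, can be illustrated with a short sketch. It assumes the definition commonly used in the multi-class imbalance literature (the geometric mean of the per-class recalls); the abstract does not spell out the formula, so this is an illustration under that assumption and not the authors' code. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every class that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def mg_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed MG-mean definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1.0 / len(recalls))


# Toy data (hypothetical): one majority class "a", two minority classes.
y_true = ["a"] * 8 + ["b"] * 2 + ["c"] * 2
y_pred = ["a"] * 8 + ["b", "a"] + ["c", "c"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = mg_mean(y_true, y_pred)
```

In this toy case a single error on minority class "b" halves that class's recall, so the MG-mean falls well below the plain accuracy. This sensitivity to minority-class errors is why such measures are preferred over accuracy for imbalanced data.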
Pages: 599-614
Page count: 16