A robust multi-class AdaBoost algorithm for mislabeled noisy data

Cited by: 61
Authors
Sun, Bo [1 ]
Chen, Songcan [1 ]
Wang, Jiandong [1 ]
Chen, Haiyan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, 29 Yudao St, Nanjing 210016, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Ensemble learning; AdaBoost; Robustness; Multi-class classification; Mislabeled noise; CLASSIFICATION; SETS;
DOI
10.1016/j.knosys.2016.03.024
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
AdaBoost has been theoretically and empirically proved to be a very successful ensemble learning algorithm: it iteratively generates a set of diverse weak learners and combines their outputs by weighted majority voting to form the final decision. However, in some cases AdaBoost overfits, especially in the presence of mislabeled noisy training examples, which degrades its generalization performance and robustness. Recently, a representative approach named noise-detection based AdaBoost (ND_AdaBoost) was proposed to improve the robustness of AdaBoost in the two-class classification scenario; in the multi-class scenario, however, this approach can hardly achieve satisfactory performance, for the following three reasons. (1) If we decompose a multi-class classification problem using strategies such as one-versus-all or one-versus-one, the resulting two-class problems usually have imbalanced training sets, which negatively influences the performance of ND_AdaBoost. (2) If we directly apply ND_AdaBoost to the multi-class classification scenario, its two-class loss function is no longer applicable, and its accuracy requirement for the (weak) base classifiers, i.e., greater than 0.5, is too strong to be satisfied in most cases. (3) ND_AdaBoost still tends to overfit, since it increases the weights of correctly classified noisy examples, which can make it focus on learning these noisy examples in subsequent iterations. To resolve this dilemma, in this paper we propose a robust multi-class AdaBoost algorithm (Rob_MulAda) whose key ingredients are a noise-detection based multi-class loss function and a new weight updating scheme. Our experimental study indicates that the newly proposed weight updating scheme is indeed more robust to mislabeled noise than that of ND_AdaBoost in both two-class and multi-class scenarios. In addition, through comparison experiments, we also verify the effectiveness of Rob_MulAda and provide a suggestion for choosing the most appropriate noise-alleviating approach according to the concrete noise level in practical applications. Crown Copyright (C) 2016 Published by Elsevier B.V. All rights reserved.
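To illustrate the boosting mechanics the abstract refers to (the relaxed multi-class accuracy requirement on weak learners and the role of the weight updating step), the sketch below shows a SAMME-style multi-class AdaBoost training loop in which examples flagged by a separate noise-detection step do not have their weights increased when they are misclassified. This is a minimal sketch under assumptions made here: the function name samme_noise_aware, the noise_flags input, and the "freeze the weight of suspected noisy examples" rule are illustrative only and are not the paper's Rob_MulAda loss function or weight updating scheme.

# Illustrative sketch only: SAMME-style multi-class AdaBoost with a crude
# noise-aware weight update. NOT the paper's Rob_MulAda algorithm.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def samme_noise_aware(X, y, n_rounds=50, noise_flags=None):
    """Train a multi-class AdaBoost (SAMME) ensemble.

    noise_flags: optional boolean array marking examples suspected to be
    mislabeled (e.g. produced by some noise-detection step); their weights
    are not increased when they are misclassified.
    """
    y = np.asarray(y)
    classes = np.unique(y)
    n, K = len(y), len(classes)
    w = np.full(n, 1.0 / n)                  # example weights
    if noise_flags is None:
        noise_flags = np.zeros(n, dtype=bool)

    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        miss = pred != y

        err = np.dot(w, miss) / w.sum()
        if err >= 1.0 - 1.0 / K:             # weaker than random guessing: stop
            break
        err = max(err, 1e-10)
        # SAMME learner weight: accuracy only needs to exceed 1/K, not 1/2
        alpha = np.log((1.0 - err) / err) + np.log(K - 1.0)

        # Weight update: boost misclassified examples, but leave suspected
        # noisy examples untouched so the ensemble does not chase their labels.
        boost = np.exp(alpha * miss)
        boost[noise_flags & miss] = 1.0
        w *= boost
        w /= w.sum()

        learners.append(stump)
        alphas.append(alpha)

    def predict(Xq):
        votes = np.zeros((len(Xq), K))
        for a, h in zip(alphas, learners):
            p = h.predict(Xq)
            for k, c in enumerate(classes):
                votes[:, k] += a * (p == c)
        return classes[votes.argmax(axis=1)]

    return predict

Usage would look like predict = samme_noise_aware(X_train, y_train, noise_flags=suspected) followed by y_hat = predict(X_test); with noise_flags left as None the loop reduces to plain SAMME.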
Pages: 87-102
Number of pages: 16
Related Papers
50 records in total
  • [21] Multi-Class Recognition using Noisy Training Data with a Self-Learning Approach
    Ghahremani, Amir
    Bondarev, Egor
    de With, Peter H. N.
    2018 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2018, : 733 - 740
  • [22] Two-stage multi-class AdaBoost for facial expression recognition
    Deng, Hongbo
    Zhu, Jianke
    Lyu, Michael R.
    King, Irwin
    2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2007, : 3010 - 3015
  • [23] Nested AdaBoost procedure for classification and multi-class nonlinear discriminant analysis
    Filisbino, Tiene A.
    Giraldi, Gilson A.
    Thomaz, Carlos E.
    SOFT COMPUTING, 2020, 24 (23) : 17969 - 17990
  • [25] SCALA: Scaling algorithm for multi-class imbalanced classification A novel algorithm specifically designed for multi-class multiple minority imbalanced data problems.
    Barzinji, Ala O.
    Ma, Jixin
    Ma, Chaoying
    PROCEEDINGS OF 2023 8TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING TECHNOLOGIES, ICMLT 2023, 2023, : 68 - 73
  • [27] Application of Improved Multi-class Active Learning Algorithm in Data Processing
    Lin, Haiming
    Huang, Yao
    Yang, Dexiang
    Li, Ke
    Wei, Jun
    2021 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE BIG DATA AND INTELLIGENT SYSTEMS (HPBD&IS), 2021, : 116 - 124
  • [28] OAHO: an effective algorithm for multi-class learning from imbalanced data
    Murphey, Yi L.
    Wang, Haoxing
    Ou, Guobin
    Feldkamp, Lee A.
    2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2007, : 406 - +
  • [29] An online ensemble classification algorithm for multi-class imbalanced data stream
    Han, Meng
    Li, Chunpeng
    Meng, Fanxing
    He, Feifei
    Zhang, Ruihua
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (11) : 6845 - 6880
  • [30] Multi-class extensions of the GLDB feature extraction algorithm for spectral data
    Paclík, P
    Verzakov, S
    Duin, RPW
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, 2004, : 629 - 632