Sampling plus Reweighting: Boosting the Performance of AdaBoost on Imbalanced Datasets

Cited by: 0
Authors
Yuan, Bo [1]
Ma, Xiaoli [1]
Affiliations
[1] Tsinghua Univ, Intelligent Comp Lab, Div Informat, Grad Sch Shenzhen, Shenzhen 518055, Peoples R China
Source
2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2012
Keywords
Class Imbalance Learning; GAs; AdaBoost; SMOTE; ENSEMBLES; DIVERSITY;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing attempts to improve the performance of AdaBoost on imbalanced datasets have largely focused on modifying its weight updating rule or incorporating sampling or cost-sensitive learning techniques. In this paper, we propose to tackle the challenge from a novel perspective. Initially, the dataset is over-sampled and the standard AdaBoost is applied to create a series of base classifiers. Next, the weights of the classifiers are further retrained by Genetic Algorithms (GAs) or comparable optimization techniques, where more targeted performance measures such as G-mean and F-measure can be directly used as the objective function. Consequently, unlike other indirect solutions, this sampling + reweighting strategy can purposefully tune AdaBoost towards a certain performance measure of interest with only moderate computational overhead. Experimental results on ten benchmark datasets show that this strategy can reliably boost the performance of AdaBoost and consistently outperforms EasyEnsemble, which is a competent ensemble method for class imbalance learning.
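The two stages described in the abstract (over-sample and run standard AdaBoost, then retune the classifier weights against G-mean) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: it assumes random over-sampling by duplication instead of SMOTE, decision stumps as base classifiers, a simple mutation-only evolutionary search in place of a full GA, and a synthetic two-class dataset. All function names and parameter values are hypothetical, and scoring the reweighting on the original (not over-sampled) training set is also an assumption.

```python
import numpy as np

def g_mean(y_true, y_pred):
    """Geometric mean of recall on the positive (+1) and negative (-1) class."""
    tpr = np.mean(y_pred[y_true == 1] == 1)
    tnr = np.mean(y_pred[y_true == -1] == -1)
    return float(np.sqrt(tpr * tnr))

def oversample(X, y, rng):
    """Random over-sampling (assumption: stands in for SMOTE):
    duplicate minority (+1) rows until both classes have equal size."""
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == -1)
    extra = rng.choice(pos, size=len(neg) - len(pos), replace=True)
    idx = np.concatenate([neg, pos, extra])
    return X[idx], y[idx]

def stump_predict(X, j, t, s):
    """Predict s where feature j is <= t, and -s otherwise."""
    return np.where(X[:, j] <= t, s, -s)

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                err = w[stump_predict(X, j, t, s) != y].sum()
                if err < best[0]:
                    best = (err, j, t, s)
    return best

def adaboost(X, y, rounds):
    """Standard discrete AdaBoost with decision stumps."""
    w = np.full(len(y), 1.0 / len(y))
    stumps, alphas = [], []
    for _ in range(rounds):
        err, j, t, s = fit_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * stump_predict(X, j, t, s))
        w /= w.sum()
        stumps.append((j, t, s))
        alphas.append(alpha)
    return stumps, np.array(alphas)

def reweight(H, y, alphas, rng, generations=200, sigma=0.3):
    """Mutation-only evolutionary search (a simplification of a GA) over
    the classifier weights, using G-mean directly as the objective."""
    def score(a):
        return g_mean(y, np.where(H @ a >= 0, 1, -1))
    best, best_score = alphas.copy(), score(alphas)
    for _ in range(generations):
        cand = np.clip(best + rng.normal(0, sigma, size=best.shape), 0, None)
        s = score(cand)
        if s >= best_score:  # keep the candidate only if it is no worse
            best, best_score = cand, s
    return best, best_score

# Synthetic imbalanced data: 200 negatives, 20 positives.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(1.5, 1.0, (20, 2))])
y = np.concatenate([-np.ones(200, dtype=int), np.ones(20, dtype=int)])

Xb, yb = oversample(X, y, rng)                  # stage 1a: sampling
stumps, alphas = adaboost(Xb, yb, rounds=10)    # stage 1b: standard AdaBoost
H = np.column_stack([stump_predict(X, *st) for st in stumps])
base_score = g_mean(y, np.where(H @ alphas >= 0, 1, -1))
tuned, tuned_score = reweight(H, y, alphas, rng)  # stage 2: reweighting
print(f"G-mean before: {base_score:.3f}, after: {tuned_score:.3f}")
```

Because the search starts from the AdaBoost weights and only accepts candidates that score at least as well, the tuned G-mean on the scoring set cannot fall below the starting value. Swapping `g_mean` for an F-measure function would retarget the same loop at a different measure.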
Pages: 6
Related papers
50 in total
  • [21] Performance Evaluations of Supervised Learners on Imbalanced Datasets
    Bulut, Faruk
    2016 ELECTRIC ELECTRONICS, COMPUTER SCIENCE, BIOMEDICAL ENGINEERINGS' MEETING (EBBT), 2016
  • [22] Performance Evaluation of Hybrid PSO-BPNN-AdaBoost and PSO-BPNN-XGBoost Models for Rockburst Prediction with Imbalanced Datasets
    Li, Shujian
    Lu, Pengpeng
    Liang, Weizhang
    Chen, Ying
    Da, Qi
    APPLIED SCIENCES-BASEL, 2024, 14 (24)
  • [23] Combining integrated sampling with SVM ensembles for learning from imbalanced datasets
    Liu, Yang
    Yu, Xiaohui
    Huang, Jimmy Xiangji
    An, Aijun
    INFORMATION PROCESSING & MANAGEMENT, 2011, 47 (04) : 617 - 631
  • [24] Gaussian Sampling Approach to deal with Imbalanced Telemetry Datasets in Industrial Applications
    Galve, Sergio
    Puig, Vicenc
    Vilajosana, Xavi
    2023 31ST MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION, MED, 2023 : 605 - 611
  • [25] Handling imbalanced datasets by partially guided hybrid sampling for pattern recognition
    Sandhan, Tushar
    Choi, Jin Young
    2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014 : 1449 - 1453
  • [26] Cluster-Based Minority Over-Sampling for Imbalanced Datasets
    Puntumapon, Kamthorn
    Rakthamamon, Thanawin
    Waiyamai, Kitsana
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (12) : 3101 - 3109
  • [27] Impact of Imbalanced Datasets Preprocessing in the Performance of Associative Classifiers
    Rangel-Diaz-de-la-Vega, Adolfo
    Villuendas-Rey, Yenny
    Yanez-Marquez, Cornelio
    Camacho-Nieto, Oscar
    Lopez-Yanez, Itzama
    APPLIED SCIENCES-BASEL, 2020, 10 (08)
  • [28] Oversampling for Mining Imbalanced Datasets: Taxonomy and Performance Evaluation
    Jedrzejowicz, Piotr
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 13501 : 322 - 333
  • [29] PERFORMANCE EVALUATION OF DECISION TREE CLASSIFIERS AND ADABOOST ON CANCER DATASETS
    Hasan, Abid
    2011 INTERNATIONAL CONFERENCE ON COMPUTER AND COMPUTATIONAL INTELLIGENCE (ICCCI 2011), 2012 : 155 - 160
  • [30] Gradient Deep Learning Boosting and Its Application on the Imbalanced Datasets Containing Noises in Manufacturing
    Duc-Khanh Nguyen
    Chan, Chien-Lung
    Dinh-Van Phan
    2021 INTERNATIONAL CONFERENCE ON SECURITY AND INFORMATION TECHNOLOGIES WITH AI, INTERNET COMPUTING AND BIG-DATA APPLICATIONS, 2023, 314 : 225 - 235