Sampling plus Reweighting: Boosting the Performance of AdaBoost on Imbalanced Datasets

Cited by: 0
Authors
Yuan, Bo [1]
Ma, Xiaoli [1]
Affiliation
[1] Tsinghua Univ, Intelligent Comp Lab, Div Informat, Grad Sch Shenzhen, Shenzhen 518055, Peoples R China
Source
2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2012
Keywords
Class Imbalance Learning; GAs; AdaBoost; SMOTE; ENSEMBLES; DIVERSITY;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing attempts to improve the performance of AdaBoost on imbalanced datasets have largely been focused on modifying its weight updating rule or incorporating sampling or cost sensitive learning techniques. In this paper, we propose to tackle the challenge from a novel perspective. Initially, the dataset is over-sampled and the standard AdaBoost is applied to create a series of base classifiers. Next, the weights of the classifiers are further retrained by Genetic Algorithms (GAs) or comparable optimization techniques where more targeted performance measures such as G-mean and F-measure can be directly used as the objective function. Consequently, unlike other indirect solutions, this sampling + reweighting strategy can purposefully tune AdaBoost towards a certain performance measure of interest with only moderate computational overhead. Experimental results on ten benchmark datasets show that this strategy can reliably boost the performance of AdaBoost and has consistent superiority over EasyEnsemble, which is a competent ensemble method for class imbalance learning.
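The reweighting step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the over-sampling and AdaBoost training stages are omitted, the base classifier outputs are hard-coded toy data, and the function names (`g_mean`, `vote`, `reweight_ga`) as well as the GA operators (truncation selection, uniform crossover, Gaussian mutation) are assumptions chosen for brevity. It only shows how G-mean can serve directly as the objective when retraining the classifier weights.

```python
import math
import random

def g_mean(y_true, y_pred):
    """Geometric mean of true-positive and true-negative rates (labels are +1/-1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        return 0.0
    return math.sqrt((tp / pos) * (tn / neg))

def vote(weights, base_preds):
    """Weighted vote; base_preds[t][i] is classifier t's label for sample i."""
    n = len(base_preds[0])
    out = []
    for i in range(n):
        score = sum(w * preds[i] for w, preds in zip(weights, base_preds))
        out.append(1 if score > 0 else -1)  # ties go to the negative class
    return out

def reweight_ga(init_weights, base_preds, y_true,
                generations=50, pop_size=20, sigma=0.3, seed=0):
    """Hypothetical GA: retrain classifier weights to maximize G-mean."""
    rng = random.Random(seed)

    def fitness(w):
        return g_mean(y_true, vote(w, base_preds))

    # Start from the given (AdaBoost-style) weights plus mutated copies.
    population = [list(init_weights)]
    while len(population) < pop_size:
        population.append([max(0.0, w + rng.gauss(0, sigma)) for w in init_weights])

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, elites survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]            # uniform crossover
            child = [max(0.0, w + rng.gauss(0, sigma)) for w in child]  # Gaussian mutation
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Toy imbalanced problem: 2 positives, 6 negatives, 3 base classifiers.
y_true = [1, 1, -1, -1, -1, -1, -1, -1]
base_preds = [
    [-1, -1, -1, -1, -1, -1, -1, -1],  # always predicts the majority class
    [1, 1, 1, -1, -1, -1, -1, -1],     # finds both positives, one false alarm
    [1, -1, -1, -1, -1, -1, 1, -1],    # finds one positive, one false alarm
]
init_weights = [1.0, 0.6, 0.3]         # illustrative starting weights

baseline = g_mean(y_true, vote(init_weights, base_preds))
tuned_weights = reweight_ga(init_weights, base_preds, y_true)
tuned = g_mean(y_true, vote(tuned_weights, base_preds))
print(f"G-mean before: {baseline:.3f}, after: {tuned:.3f}")
```

Because the starting weights are kept in the initial population and the best half survives each generation, the tuned G-mean in this sketch can never fall below the baseline. The sketch scores fitness on the same data it optimizes on; a real experiment would use held-out data.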
Pages: 6
Related papers
50 in total
  • [1] HAR: Hardness Aware Reweighting for Imbalanced Datasets
    Duggal, Rahul
    Freitas, Scott
    Dhamnani, Sunny
    Chau, Duen Horng
    Sun, Jimeng
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 735 - 745
  • [2] Effects of the Use of Boosting on Classification Performance of Imbalanced Bioinformatics Datasets
    Khoshgoftaar, Taghi M.
    Fazelpour, Alireza
    Dittman, David J.
    Napolitano, Amri
    2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE), 2014, : 420 - 426
  • [3] Boosting prediction accuracy on imbalanced datasets with SVM ensembles
    Liu, Yang
    An, Aijun
    Huang, Xiangji
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 107 - 118
  • [4] AdaBoost-CNN: An adaptive boosting algorithm for convolutional neural networks to classify multi-class imbalanced datasets using transfer learning
    Taherkhani, Aboozar
    Cosma, Georgina
    McGinnity, T. M.
    NEUROCOMPUTING, 2020, 404 : 351 - 366
  • [5] A Discriminative Dictionary Learning-AdaBoost-SVM Classification Method on Imbalanced Datasets
    Barstugan, Mucahid
    Ceylan, Rahime
    2017 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP), 2017,
  • [6] A New Hybrid Sampling Approach for Classification of Imbalanced Datasets
    Hanskunatai, Anantaporn
    PROCEEDINGS OF 2018 3RD INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION SYSTEMS (ICCCS), 2018, : 67 - 71
  • [7] Does the Inclusion of Data Sampling Improve the Performance of Boosting Algorithms on Imbalanced Bioinformatics Data?
    Fazelpour, Alireza
    Khoshgoftaar, Taghi M.
    Dittman, David J.
    Napolitano, Amri
    2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2015, : 527 - 534
  • [8] Boosting prediction performance on imbalanced dataset
    Zareapoor M.
    Shamsolmoali P.
    International Journal of Information and Communication Technology, 2018, 13 (02): : 186 - 195
  • [9] Weighted Conditional Mutual Information Based Boosting for Classification of Imbalanced Datasets
    Utasi, Akos
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 2711 - 2714
  • [10] Adaptations of Extreme Gradient Boosting for Imbalanced Datasets with Application in Credit Scoring
    Ferreira, Gabriel Almeida
    Suzuki, Adriano Kamimura
    SIGMAE, 2024, 13 (04): : 165 - 178