Sampling plus Reweighting: Boosting the Performance of AdaBoost on Imbalanced Datasets

Cited by: 0
Authors
Yuan, Bo [1]
Ma, Xiaoli [1]
Affiliations
[1] Tsinghua Univ, Intelligent Comp Lab, Div Informat, Grad Sch Shenzhen, Shenzhen 518055, Peoples R China
Source
2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2012
Keywords
Class Imbalance Learning; GAs; AdaBoost; SMOTE; ENSEMBLES; DIVERSITY;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing attempts to improve the performance of AdaBoost on imbalanced datasets have largely been focused on modifying its weight updating rule or incorporating sampling or cost sensitive learning techniques. In this paper, we propose to tackle the challenge from a novel perspective. Initially, the dataset is over-sampled and the standard AdaBoost is applied to create a series of base classifiers. Next, the weights of the classifiers are further retrained by Genetic Algorithms (GAs) or comparable optimization techniques where more targeted performance measures such as G-mean and F-measure can be directly used as the objective function. Consequently, unlike other indirect solutions, this sampling + reweighting strategy can purposefully tune AdaBoost towards a certain performance measure of interest with only moderate computational overhead. Experimental results on ten benchmark datasets show that this strategy can reliably boost the performance of AdaBoost and has consistent superiority over EasyEnsemble, which is a competent ensemble method for class imbalance learning.
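The reweighting step described in the abstract can be illustrated with a minimal sketch. It assumes the over-sampling and AdaBoost training have already been done, so it starts from the base classifiers' +1/-1 predictions (`votes`) and their AdaBoost weights (`init_weights`). The function names, the GA operators (tournament selection, uniform crossover, Gaussian mutation, elitism) and all parameter values are illustrative assumptions, not the configuration used in the paper.

```python
import math
import random


def g_mean(y_true, y_pred):
    """Geometric mean of true-positive and true-negative rates (labels are +1/-1)."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == -1]
    if not pos or not neg:
        return 0.0
    tpr = sum(1 for p in pos if p == 1) / len(pos)
    tnr = sum(1 for p in neg if p == -1) / len(neg)
    return math.sqrt(tpr * tnr)


def ensemble_predict(weights, votes):
    """Weighted vote; votes[t][i] is base classifier t's +1/-1 prediction for sample i."""
    n = len(votes[0])
    scores = [sum(w * v[i] for w, v in zip(weights, votes)) for i in range(n)]
    return [1 if s > 0 else -1 for s in scores]  # ties go to the negative class


def ga_reweight(votes, y, init_weights, pop_size=30, generations=40, seed=0):
    """Hypothetical GA: search non-negative classifier weights that maximize G-mean."""
    rng = random.Random(seed)
    k = len(init_weights)

    def fitness(w):
        return g_mean(y, ensemble_predict(w, votes))

    # Seed the population with the AdaBoost weights plus random candidates.
    pop = [list(init_weights)]
    pop += [[rng.random() for _ in range(k)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = [pop[0][:]]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=fitness)  # tournament selection
            b = max(rng.sample(pop, 2), key=fitness)
            # Uniform crossover, then Gaussian mutation clipped at zero.
            child = [x if rng.random() < 0.5 else z for x, z in zip(a, b)]
            child = [max(0.0, c + rng.gauss(0, 0.1)) if rng.random() < 0.2 else c
                     for c in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)


# Toy usage: 2 minority (+1) and 6 majority (-1) samples, 3 base classifiers.
y = [1, 1, -1, -1, -1, -1, -1, -1]
votes = [
    [-1, -1, -1, -1, -1, -1, -1, -1],  # always predicts the majority class
    [1, 1, 1, -1, -1, -1, -1, -1],
    [1, -1, -1, -1, -1, -1, -1, 1],
]
init_weights = [2.0, 0.5, 0.5]
tuned_weights = ga_reweight(votes, y, init_weights)
```

Because the initial weights are part of the starting population and the best individual is always carried over, the tuned weights can never score lower than the AdaBoost weights on the data used as the fitness set. Swapping `g_mean` for an F-measure function would retarget the search in the same way the abstract describes; in practice the fitness should be computed on held-out data to limit overfitting.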
Pages: 6
Related papers
50 in total
  • [31] Experimental Comparison of Sampling Techniques for Imbalanced Datasets Using Various Classification Models
    Pattanayak, Sanjibani Sudha
    Rout, Minakhi
    PROGRESS IN ADVANCED COMPUTING AND INTELLIGENT ENGINEERING, VOL 2, 2018, 564 : 13 - 22
  • [32] A Hybrid Sampling Method Based on Safe Screening for Imbalanced Datasets with Sparse Structure
    Shi, Hongbo
    Gao, Qigang
    Ji, Suqin
    Liu, Yanxin
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [33] Machine Learning with Imbalanced EEG Datasets using Outlier-based Sampling
    Islah, Nizar
    Koerner, Jamie
    Genov, Roman
    Valiante, Taufik A.
    O'Leary, Gerard
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 112 - 115
  • [34] A Novel Evolutionary Preprocessing Method Based on Over-sampling and Under-sampling for Imbalanced Datasets
    Wong, Ginny Y.
    Leung, Frank H. F.
    Ling, Sai-Ho
    39TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY (IECON 2013), 2013, : 2354 - 2359
  • [35] OUBoost: boosting based over and under sampling technique for handling imbalanced data
    Mostafaei, Sahar Hassanzadeh
    Tanha, Jafar
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (10) : 3393 - 3411
  • [36] Boosting association rule mining in large datasets via Gibbs sampling
    Qian, Guoqi
    Rao, Calyampudi Radhakrishna
    Sun, Xiaoying
    Wu, Yuehua
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2016, 113 (18) : 4958 - 4963
  • [37] Classification of imbalanced ECG beats using re-sampling techniques and AdaBoost ensemble classifier
    Rajesh, Kandala N. V. P. S.
    Dhuli, Ravindra
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2018, 41 : 242 - 254
  • [38] Early Fault Detection in Induction Motors Using AdaBoost With Imbalanced Small Data and Optimized Sampling
    Martin-Diaz, Ignacio
    Morinigo-Sotelo, Daniel
    Duque-Perez, Oscar
    Romero-Troncoso, Rene de J.
    IEEE TRANSACTIONS ON INDUSTRY APPLICATIONS, 2017, 53 (03) : 3066 - 3075
  • [39] OUBoost: boosting based over and under sampling technique for handling imbalanced data
    Sahar Hassanzadeh Mostafaei
    Jafar Tanha
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 3393 - 3411
  • [40] CUSBoost: Cluster-based Under-sampling with Boosting for Imbalanced Classification
    Rayhan, Farshid
    Ahmed, Sajid
    Mahbub, Asif
    Jani, Md. Rafsan
    Shatabda, Swakkhar
    Farid, Dewan Md.
    2017 2ND INTERNATIONAL CONFERENCE ON COMPUTATIONAL SYSTEMS AND INFORMATION TECHNOLOGY FOR SUSTAINABLE SOLUTION (CSITSS-2017), 2017, : 70 - 75