A Hybrid Approach Handling Imbalanced Datasets

Cited by: 0
Authors
Soda, Paolo [1]
Affiliations
[1] Univ Campus Biomed Rome, Integrated Res Ctr, Med Informat & Comp Sci Lab, Rome, Italy
Keywords
STRATEGIES;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Several binary classification problems exhibit imbalance in class distribution, which influences system learning. Indeed, traditional machine learning algorithms are biased towards the majority class, thus producing poor predictive accuracy on the minority one. To overcome this limitation, many approaches have been proposed to build artificially balanced training sets. Beyond their specific drawbacks, they achieve more balanced accuracies on each class at the expense of the global accuracy. This paper first reviews the more recent methods for coping with unbalanced datasets and then proposes a strategy that overcomes the main drawbacks of existing approaches. It is based on an ensemble of classifiers trained on balanced subsets of the original unbalanced training set, working in conjunction with a classifier trained on the original unbalanced dataset. The performance of the method has been estimated on six public datasets, showing its effectiveness also in comparison with other approaches. It also makes it possible to modify the system behaviour according to the operating scenario.
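The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not say which base classifier is used or how the ensemble and the classifier trained on the unbalanced set are combined. The nearest-centroid base learner, the helper names (`fit_centroids`, `balanced_subsets`, `fit_hybrid`, `predict_hybrid`), and the `threshold` combination rule are all assumptions made for illustration.

```python
import random
from statistics import mean

def fit_centroids(X, y):
    """Placeholder base learner (assumption): one mean vector per class."""
    cents = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [mean(col) for col in zip(*rows)]
    return cents

def predict_centroids(cents, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(cents, key=lambda label: dist2(cents[label]))

def balanced_subsets(X, y, minority=1, seed=0):
    """Pair all minority samples with disjoint, equally sized majority chunks."""
    rng = random.Random(seed)
    min_rows = [x for x, t in zip(X, y) if t == minority]
    maj = [(x, t) for x, t in zip(X, y) if t != minority]
    rng.shuffle(maj)
    k = len(min_rows)
    subsets = []
    for start in range(0, len(maj) - k + 1, k):
        chunk = maj[start:start + k]
        Xs = min_rows + [x for x, _ in chunk]
        ys = [minority] * k + [t for _, t in chunk]
        subsets.append((Xs, ys))
    return subsets

def fit_hybrid(X, y, minority=1, seed=0):
    """Train the balanced ensemble plus one classifier on the full unbalanced set."""
    ensemble = [fit_centroids(Xs, ys)
                for Xs, ys in balanced_subsets(X, y, minority, seed)]
    unbalanced = fit_centroids(X, y)
    return ensemble, unbalanced

def predict_hybrid(model, x, minority=1, threshold=0.5):
    """Hypothetical combination rule (assumption, not from the paper):
    accept the minority label when enough ensemble members vote for it,
    otherwise defer to the classifier trained on the unbalanced set."""
    ensemble, unbalanced = model
    votes = [predict_centroids(c, x) for c in ensemble]
    minority_share = votes.count(minority) / len(votes)
    if minority_share >= threshold:
        return minority
    return predict_centroids(unbalanced, x)

# Toy data: 9 majority samples (label 0) and 3 minority samples (label 1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 2], [2, 0], [1, 2], [2, 1], [2, 2],
     [5, 5], [5, 6], [6, 5]]
y = [0] * 9 + [1] * 3
model = fit_hybrid(X, y)
print(predict_hybrid(model, [5.5, 5.5]), predict_hybrid(model, [0.5, 0.5]))
```

In this sketch, lowering `threshold` makes the system more willing to predict the minority class, which is one possible way to adapt behaviour to an operating scenario; the paper's own mechanism may differ.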
Pages: 209 - 218
Page count: 10
Related papers
50 in total
  • [21] ARCID: A New Approach to Deal with Imbalanced Datasets Classification
    Abdellatif, Safa
    Ben Hassine, Mohamed Ali
    Ben Yahia, Sadok
    Bouzeghoub, Amel
    SOFSEM 2018: THEORY AND PRACTICE OF COMPUTER SCIENCE, 2018, 10706 : 569 - 580
  • [22] A GENETIC RULE LEARNING APPROACH TO DEAL WITH IMBALANCED DATASETS
    Mahani, Aouatef
    Benkhider, Sadjia
    Baba-Ali, Ahmed Riadh
    PROCEEDINGS OF THE EUROPEAN CONFERENCE ON DATA MINING 2015 AND INTERNATIONAL CONFERENCES ON INTELLIGENT SYSTEMS AND AGENTS 2015 AND THEORY AND PRACTICE IN MODERN COMPUTING 2015, 2015, : 151 - 156
  • [23] A hybrid stacking classifier with feature selection for handling imbalanced data
    Abraham A.
    Kayalvizhi R.
    Mohideen H.S.
    Journal of Intelligent and Fuzzy Systems, 2024, 46 (04) : 9103 - 9117
  • [24] Soft Margin SVM Modeling for Handling Imbalanced Human Activity Datasets in Multiple Homes
    Abidine, M'hamed Bilal
    Yala, Nawel
    Fergani, Belkacem
    Clavier, Laurent
    2014 INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS (ICMCS), 2014, : 421 - 426
  • [25] Hybrid Undersampling and Oversampling for Handling Imbalanced Credit Card Data
    Alamri, Maram
    Ykhlef, Mourad
    IEEE ACCESS, 2024, 12 : 14050 - 14060
  • [26] AWGAN: An adaptive weighting GAN approach for oversampling imbalanced datasets
    Guan, Shaopeng
    Zhao, Xiaoyan
    Xue, Yuewei
    Pan, Hao
    INFORMATION SCIENCES, 2024, 663
  • [27] An efficient classification approach in imbalanced datasets for intrinsic plagiarism detection
    Andrianna Polydouri
    Eleni Vathi
    Georgios Siolas
    Andreas Stafylopatis
    Evolving Systems, 2020, 11 : 503 - 515
  • [28] An efficient classification approach in imbalanced datasets for intrinsic plagiarism detection
    Polydouri, Andrianna
    Vathi, Eleni
    Siolas, Georgios
    Stafylopatis, Andreas
    EVOLVING SYSTEMS, 2020, 11 (03) : 503 - 515
  • [29] Data-Centric Optimization Approach for Small, Imbalanced Datasets
    Tanov, Vladislav
    JOURNAL OF INFORMATION AND ORGANIZATIONAL SCIENCES, 2023, 47 (01) : 167 - 177
  • [30] Combination Approach of SMOTE and Biased-SVM for Imbalanced Datasets
    Wang He-Yong
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 228 - 231