A Hybrid Approach Handling Imbalanced Datasets

Cited by: 0
Authors
Soda, Paolo [1]
Affiliations
[1] Univ Campus Biomed Rome, Integrated Res Ctr, Med Informat & Comp Sci Lab, Rome, Italy
Keywords
STRATEGIES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Several binary classification problems exhibit imbalance in class distribution, influencing system learning. Indeed, traditional machine learning algorithms are biased towards the majority class, thus producing poor predictive accuracy over the minority one. To overcome this limitation, many approaches have been proposed up to now to build artificially balanced training sets. Besides their specific drawbacks, they achieve more balanced accuracies on each class at the expense of the global accuracy. This paper first reviews the more recent methods coping with unbalanced datasets and then proposes a strategy overcoming the main drawbacks of existing approaches. It is based on an ensemble of classifiers trained on balanced subsets of the original unbalanced training set, working in conjunction with the classifier trained on the original unbalanced dataset. The performance of the method has been estimated on six public datasets, proving its effectiveness also in comparison with other approaches. It also gives the chance to modify the system behaviour according to the operating scenario.
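
The architecture outlined in the abstract (an ensemble trained on balanced subsets, paired with one classifier trained on the full unbalanced set) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the two parts are combined, so the agreement rule in `hybrid_predict`, the `min_agreement` parameter, the nearest-centroid base learner, and the synthetic data are all hypothetical stand-ins.

```python
import random
from statistics import mean


def balanced_subsets(majority, minority, rng):
    """Split the majority class into chunks of about minority size and
    pair each chunk with all minority samples."""
    shuffled = majority[:]
    rng.shuffle(shuffled)
    k = max(1, len(shuffled) // len(minority))
    return [shuffled[i::k] + minority for i in range(k)]


def fit_centroids(samples):
    """Toy base learner (assumption): one centroid per class."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(mean(col) for col in zip(*xs)) for y, xs in by_label.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def hybrid_predict(full_model, ensemble, x, min_agreement=0.8):
    """Hypothetical combination rule: trust the balanced ensemble when its
    members agree strongly, otherwise fall back to the full-data model."""
    votes = [predict(m, x) for m in ensemble]
    top = max(set(votes), key=votes.count)
    if votes.count(top) / len(votes) >= min_agreement:
        return top
    return predict(full_model, x)


# Synthetic 9:1 unbalanced data (assumption): label 0 is the majority class.
rng = random.Random(0)
majority = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(90)]
minority = [((rng.gauss(3, 1), rng.gauss(3, 1)), 1) for _ in range(10)]

subsets = balanced_subsets(majority, minority, rng)
ensemble = [fit_centroids(s) for s in subsets]
full_model = fit_centroids(majority + minority)

print(hybrid_predict(full_model, ensemble, (3.0, 3.0)))
```

In this sketch, raising or lowering `min_agreement` shifts how often the ensemble overrides the full-data classifier, which is one possible reading of the abstract's remark about adapting the system to the operating scenario.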
Pages: 209-218
Number of pages: 10
Related papers
50 in total
  • [31] Bi-SMOTE: a novel framework for handling imbalanced datasets using machine learning techniques
    Onima Tigga
    Jaya Pal
    Debjani Mustafi
    International Journal of Information Technology, 2025, 17 (1) : 431 - 445
  • [32] A dynamic time warping approach for handling class imbalanced medical datasets with missing values: A case study of protein localization site prediction
    Hung, Ling-Chien
    Hu, Ya-Han
    Tsai, Chih-Fong
    Huang, Min-Wei
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 192
  • [33] Class imbalanced data handling with cyberattack classification using Hybrid Salp Swarm Algorithm with deep learning approach
    Alabduallah, Bayan
    Maray, Mohammed
    Alruwais, Nuha
    Alabdan, Rana
    Darem, Abdulbasit A.
    Alallah, Fouad Shoie
    Alsini, Raed
    Yafoz, Ayman
    ALEXANDRIA ENGINEERING JOURNAL, 2024, 106 : 654 - 663
  • [34] A Novel Approach for Handling Imbalanced Data in Breast Cancer Dataset
    Banothu, Nagateja
    Prabu, M.
    PERVASIVE COMPUTING AND SOCIAL NETWORKING, ICPCSN 2022, 2023, 475 : 709 - 723
  • [35] Towards Effective Network Intrusion Detection in Imbalanced Datasets: A Hierarchical Approach
    Towhid, Md Shamim
    Khan, Nasik Sami
    Hasan, Md Mahibul
    Shahriar, Nashid
    2024 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS, ICNC, 2024, : 254 - 258
  • [36] Gaussian Sampling Approach to deal with Imbalanced Telemetry Datasets in Industrial Applications
    Galve, Sergio
    Puig, Vicenc
    Vilajosana, Xavi
    2023 31ST MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION, MED, 2023, : 605 - 611
  • [37] A Novel Differential Evolution-Clustering Hybrid Resampling Algorithm on Imbalanced Datasets
    Chen, Leichen
    Cai, Zhihua
    Chen, Lu
    Gu, Qiong
    THIRD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING: WKDD 2010, PROCEEDINGS, 2010, : 81 - 85
  • [38] A Hybrid Sampling Method Based on Safe Screening for Imbalanced Datasets with Sparse Structure
    Shi, Hongbo
    Gao, Qigang
    Ji, Suqin
    Liu, Yanxin
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [39] DEBOHID: A differential evolution based oversampling approach for highly imbalanced datasets
    Kaya, Ersin
    Korkmaz, Sedat
    Sahman, Mehmet Akif
    Cinar, Ahmet Cevahir
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 169
  • [40] A Multi-Objective Evolutionary Approach for Preprocessing Imbalanced Microarray Datasets
    Rangasamy, DeviPriya
    Rajappan, Sivaraj
    Natesan, Mohan
    COMPUTING IN SCIENCE & ENGINEERING, 2020, 22 (01) : 88 - 100