Novel fuzzy clustering-based undersampling framework for class imbalance problem

Authors
Pratap, Vibha [1, 2]
Singh, Amit Prakash [1]
Affiliations
[1] Guru Gobind Singh Indraprastha Univ, USICT, New Delhi, India
[2] Indira Gandhi Delhi Tech Univ Women, Delhi, India
Keywords
Class imbalance; Ensemble method; Fuzzy C-mean; Machine learning; Oversampling; Under-sampling; Classification; Prediction; SMOTE
DOI
10.1007/s13198-023-01897-1
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
The class imbalance problem occurs in many real-world datasets. Although it is often assumed that the samples of a dataset are evenly distributed across its classes, in many cases datasets are highly imbalanced. Classifying such datasets is a challenging task in machine learning. Researchers have developed many approaches to the class imbalance problem, such as resampling and ensemble methods. Resampling methods either increase the number of minority class samples (oversampling) or reduce the number of majority class samples (under-sampling). Ensemble methods, in contrast, classify several subsets of the data and combine the individual classification results into the final result. In the present study, the authors introduce a new fuzzy C-means clustering-based under-sampling method. Experiments with the proposed method were performed on 30 small-scale imbalanced datasets. The results show that the proposed method improves classification performance: compared with the k-means undersampling method, the average sensitivity improved by 1% and the average balanced accuracy improved by 3%. The results of this study should be useful for classifying imbalanced datasets in various domains.
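The abstract does not spell out the algorithm, so the following is only a minimal sketch of what clustering-based under-sampling with fuzzy C-means could look like, not the authors' method. It assumes (hypothetically) that the majority class is clustered into as many clusters as there are minority samples and that, for each cluster, the majority sample with the highest membership is kept. The function names `fuzzy_c_means`, `fcm_undersample`, and `sensitivity_and_balanced_accuracy`, and the fuzzifier value `m=2.0`, are illustrative choices. The last helper computes the two metrics named in the abstract from their standard definitions.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy C-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row is normalized to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Sample-to-center distances, floored to avoid division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def fcm_undersample(X, y, majority_label, seed=0):
    """Assumed selection rule: keep, per cluster, the majority sample
    with the highest membership (duplicates are collapsed)."""
    maj_mask = y == majority_label
    X_maj, X_min, y_min = X[maj_mask], X[~maj_mask], y[~maj_mask]
    _, U = fuzzy_c_means(X_maj, n_clusters=len(X_min), seed=seed)
    keep = np.unique(U.argmax(axis=0))  # one representative index per cluster
    X_res = np.vstack([X_maj[keep], X_min])
    y_res = np.concatenate([np.full(len(keep), majority_label), y_min])
    return X_res, y_res


def sensitivity_and_balanced_accuracy(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); balanced accuracy is the mean of
    sensitivity and specificity."""
    pos = y_true == positive
    sensitivity = np.mean(y_pred[pos] == positive)
    specificity = np.mean(y_pred[~pos] != positive)
    return sensitivity, (sensitivity + specificity) / 2.0
```

Because several clusters can share the same highest-membership sample, this sketch may keep fewer majority samples than there are minority samples; a different selection rule (for example, using the cluster centers themselves) would be an equally plausible reading of the abstract.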
Pages: 967-976
Number of pages: 10