CCO: A Cluster Core-Based Oversampling Technique for Improved Class-Imbalanced Learning

Cited by: 4
Authors
Mondal, Priyobrata [1]
Ansari, Faizanuddin [1]
Das, Swagatam [1]
Affiliations
[1] Indian Statistical Institute, Electronics and Communication Sciences Unit, Kolkata 700108, India
Keywords
Clustering algorithms; Noise measurement; Interpolation; Noise; Computational intelligence; Classification algorithms; Task analysis; Classification; imbalanced data; oversampling; synthetic minority oversampling technique; MEAN SHIFT; K-MEANS; SMOTE
DOI
10.1109/TETCI.2024.3407784
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Supervised classification problems from the real world typically face a challenge characterized by the scarcity of samples in one or more target classes compared to the rest of the majority classes. In response to such class imbalance, we propose an oversampling technique based on clustering, aiming to populate the minority class with synthetic samples. This approach capitalizes on the notion of "Cluster Cores," representing locally dense regions within clusters. These Cluster Cores act as central, densely crowded areas that capture intricate topological properties of the corresponding clusters, especially in complex datasets with a non-convex spatial orientation in the feature space. By concentrating on these high-density regions, our clustering-based oversampling technique generates synthetic samples within the convex hull region of minority class instances in the formed clusters. This strategy ensures the creation of points that align with the data space and considers each minority instance within a specific cluster, thereby averting the problems encountered due to the generation of artificial samples by mere linear combination of the minority class data points, as is encountered in SMOTE (Synthetic Minority Oversampling Technique)-based algorithms. To assess the efficacy of our proposal, we conducted experimental comparisons against several cutting-edge algorithms, considering an array of evaluation metrics on well-known datasets used in the literature for both binary and multi-class classification. Additionally, we undertook a detailed ablation study, scrutinized existing algorithms in our context, delineated their strengths and limitations, and contemplated potential research directions in this domain.
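The abstract's central idea, generating synthetic points inside the convex hull of the minority instances that share a cluster, can be illustrated with a minimal sketch. This is not the authors' CCO algorithm: the Cluster Core detection step is omitted, the cluster labels are assumed to be precomputed by some clustering method, and the function name `convex_oversample` and its parameters are hypothetical.

```python
import random


def convex_oversample(minority, labels, n_new, seed=0):
    """Return n_new synthetic points, each a random convex combination of
    the minority points that share one (precomputed) cluster label.

    Illustrative only: this is an assumption-laden sketch, not the CCO method.
    """
    rng = random.Random(seed)

    # Group the minority points by cluster label.
    clusters = {}
    for point, label in zip(minority, labels):
        clusters.setdefault(label, []).append(point)
    groups = list(clusters.values())
    sizes = [len(g) for g in groups]

    synthetic = []
    for _ in range(n_new):
        # Choose a cluster with probability proportional to its size.
        members = rng.choices(groups, weights=sizes)[0]
        # Normalised Gamma(1, 1) draws give non-negative weights summing to 1,
        # so the weighted sum stays inside the cluster's convex hull.
        raw = [rng.gammavariate(1.0, 1.0) for _ in members]
        total = sum(raw)
        weights = [r / total for r in raw]
        dim = len(members[0])
        synthetic.append(tuple(
            sum(w * p[d] for w, p in zip(weights, members))
            for d in range(dim)
        ))
    return synthetic


# Two toy minority clusters: a triangle near the origin and a short segment.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
labels = [0, 0, 0, 1, 1]
new_points = convex_oversample(minority, labels, n_new=20)
```

Because every synthetic point mixes only members of a single cluster, none can fall in the empty region between the two toy clusters, which is the failure mode the abstract attributes to plain linear interpolation between arbitrary minority points.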
Pages: 1 / 13
Page count: 13