CCO: A Cluster Core-Based Oversampling Technique for Improved Class-Imbalanced Learning

Cited by: 4

Authors:

Mondal, Priyobrata [1]

Ansari, Faizanuddin [1]

Das, Swagatam [1]

Affiliation:

[1] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata 700108, India

Source:

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2024

Keywords:

Clustering algorithms; Noise measurement; Interpolation; Noise; Computational intelligence; Classification algorithms; Task analysis; Classification; imbalanced data; oversampling; synthetic minority oversampling technique; MEAN SHIFT; K-MEANS; SMOTE;

DOI:

10.1109/TETCI.2024.3407784

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Supervised classification problems from the real world typically face a challenge characterized by the scarcity of samples in one or more target classes compared to the rest of the majority classes. In response to such class imbalance, we propose an oversampling technique based on clustering, aiming to populate the minority class with synthetic samples. This approach capitalizes on the notion of "Cluster Cores," representing locally dense regions within clusters. These Cluster Cores act as central, densely crowded areas that capture intricate topological properties of the corresponding clusters, especially in complex datasets with a non-convex spatial orientation in the feature space. By concentrating on these high-density regions, our clustering-based oversampling technique generates synthetic samples within the convex hull region of minority class instances in the formed clusters. This strategy ensures the creation of points that align with the data space and considers each minority instance within a specific cluster, thereby averting the problems encountered due to the generation of artificial samples by mere linear combination of the minority class data points, as is encountered in SMOTE (Synthetic Minority Oversampling Technique)-based algorithms. To assess the efficacy of our proposal, we conducted experimental comparisons against several cutting-edge algorithms, considering an array of evaluation metrics on well-known datasets used in the literature for both binary and multi-class classification. Additionally, we undertook a detailed ablation study, scrutinized existing algorithms in our context, delineated their strengths and limitations, and contemplated potential research directions in this domain.
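The general idea in the abstract (cluster the minority class, focus on the dense central part of each cluster, and synthesize points inside the convex hull of those instances) can be illustrated with a minimal sketch. This is not the authors' CCO algorithm: the plain k-means step, the "closest to the centroid" rule used as a stand-in for a cluster core, and the names `kmeans`, `core_oversample`, and `core_frac` are all assumptions made for illustration only.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Final assignment so labels match the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def core_oversample(X_min, n_new, k=2, core_frac=0.5, seed=0):
    """Generate n_new synthetic minority points from per-cluster 'cores'.

    ASSUMPTION: a cluster's core is approximated by the core_frac fraction of
    its members closest to the centroid. The paper's definition of a Cluster
    Core is not given in the abstract and may differ.
    """
    rng = np.random.default_rng(seed)
    centroids, labels = kmeans(X_min, k, seed=seed)

    cores = []
    for j in range(k):
        members = X_min[labels == j]
        if len(members) == 0:
            continue
        d = np.linalg.norm(members - centroids[j], axis=1)
        n_core = max(1, int(np.ceil(core_frac * len(members))))
        cores.append(members[np.argsort(d)[:n_core]])

    synthetic = []
    for i in range(n_new):
        core = cores[i % len(cores)]  # cycle through clusters
        # Normalized exponentials give non-negative weights that sum to 1,
        # so the new point is a convex combination of the core points and
        # therefore lies inside their convex hull.
        w = rng.exponential(size=len(core))
        w /= w.sum()
        synthetic.append(w @ core)
    return np.array(synthetic)
```

Calling `core_oversample(X_min, n_new=20, k=2)` on an array of minority-class rows returns 20 synthetic rows. Because every weight vector is non-negative and sums to one, each synthetic point stays within the convex hull of one cluster's core rather than on a line segment between two arbitrary minority points, which is the contrast with SMOTE-style interpolation that the abstract draws.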

Citation

Pages: 1 / 13

Page count: 13
