Synthetic minority oversampling technique based on natural neighborhood graph with subgraph cores for class-imbalanced classification

Cited by: 0
Authors
Zhao, Ming [1]
Affiliations
[1] Chongqing Ind Polytech Coll, Mech Engn Inst, Yubei, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 01
Keywords
Class-imbalanced classification; Oversampling technique; Natural neighborhood graph; Noise filter; Interpolation; SMOTE; MAJORITY; NOISY;
DOI
10.1007/s11227-024-06655-z
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
The synthetic minority oversampling technique (SMOTE) has been praised by researchers in class-imbalanced classification. Although SMOTE eliminates imbalances between classes, overgeneralization and imbalances within minority classes present great challenges. Filtering-based or change-direction oversampling techniques of the SMOTE family have been developed to overcome these challenges; however, they still experience the following issues: a) many can avoid overgeneralization by removing suspicious noise or creating synthetic minority class samples in safe regions but fail to eliminate imbalances within minority classes; b) some change-direction oversampling techniques can eliminate imbalances within minority classes but cannot remove suspicious noise and have relatively high time complexity; and c) most heavily rely on more than two parameters. To overcome overgeneralization, imbalances within minority classes and the above drawbacks, this work presents an effective natural neighborhood graph-based synthetic minority oversampling technique (NaNG-SMOTE). First, a natural neighborhood graph (NaNG) is constructed on class-imbalanced data. Second, heterogeneous and homogeneous edges are defined to identify and remove suspicious noise. Third, NaNG is divided into separated subgraphs with subgraph cores, and then these subgraphs with subgraph cores of minority classes are preserved. Fourth, the sampling weight of each preserved subgraph is calculated based on the density and the number of minority class vertices. Fifth, synthetic minority class samples are created based on sampling weights and interpolation between subgraph cores and each vertex. Intensive experiments have proven that the NaNG-SMOTE outperforms 8 sophisticated oversampling techniques in improving 4 representative classifiers on synthetic or benchmark datasets from industrial applications with various imbalance ratios.
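The five steps in the abstract can be illustrated with a small sketch. This is not the paper's NaNG-SMOTE: the record gives no formulas, so several pieces are assumptions chosen only to make the idea concrete. A mutual k-nearest-neighbor graph with a fixed `k` stands in for the natural neighborhood graph, a vertex counts as suspicious noise when all of its edges are heterogeneous, the subgraph core is taken to be the vertex with the most edges inside its subgraph, and the sampling weight is simply the inverse of the subgraph size. All function names are hypothetical.

```python
import math
import random


def mutual_knn_graph(points, k):
    """Link i and j when each is among the other's k nearest points."""
    n = len(points)
    knn = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        knn.append(set(others[:k]))
    return {i: {j for j in knn[i] if i in knn[j]} for i in range(n)}


def filter_noise(graph, labels):
    """Keep a vertex unless it has edges and all of them are heterogeneous."""
    keep = set()
    for i, nbrs in graph.items():
        if not nbrs or any(labels[j] == labels[i] for j in nbrs):
            keep.add(i)
    return keep


def minority_subgraphs(graph, labels, keep, minority):
    """Connected components of kept minority vertices over homogeneous edges."""
    seen, comps = set(), []
    for start in sorted(keep):
        if labels[start] != minority or start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in graph[u]:
                if v in keep and v not in seen and labels[v] == minority:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def oversample(points, labels, minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by core-to-vertex interpolation."""
    rng = random.Random(seed)
    graph = mutual_knn_graph(points, k)
    keep = filter_noise(graph, labels)
    # Assumption: a subgraph needs at least two vertices to have a usable core.
    comps = [c for c in minority_subgraphs(graph, labels, keep, minority)
             if len(c) > 1]
    if not comps:
        return []
    # Placeholder core: the vertex with the most edges inside its subgraph.
    cores = [max(c, key=lambda i, c=c: len(graph[i] & set(c))) for c in comps]
    # Placeholder weight: smaller subgraphs are sampled more often.
    weights = [1.0 / len(c) for c in comps]
    synthetic = []
    for _ in range(n_new):
        idx = rng.choices(range(len(comps)), weights=weights)[0]
        core = points[cores[idx]]
        other = points[rng.choice(comps[idx])]
        t = rng.random()
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(core, other)))
    return synthetic
```

Because every synthetic point lies on a segment between a subgraph core and a vertex of the same subgraph, no sample is generated between separate minority clusters or around a filtered noise point. The paper's actual weight combines density with the number of minority class vertices; the inverse-size weight here is only a stand-in.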
Pages: 35
Related papers
50 in total
  • [1] Learning class-imbalanced data with region-impurity synthetic minority oversampling technique
    Li, Der-Chiang
    Wang, Ssu-Yang
    Huang, Kuan-Cheng
    Tsai, Tung-I
    INFORMATION SCIENCES, 2022, 607 : 1391 - 1407
  • [2] A novel graph oversampling framework for node classification in class-imbalanced graphs
    Xia, Riting
    Zhang, Chunxu
    Zhang, Yan
    Liu, Xueyan
    Yang, Bo
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (06)
  • [3] A novel graph oversampling framework for node classification in class-imbalanced graphs
    Riting XIA
    Chunxu ZHANG
    Yan ZHANG
    Xueyan LIU
    Bo YANG
    Science China(Information Sciences), 2024, 67 (06) : 214 - 229
  • [4] A novel oversampling technique for class-imbalanced learning based on SMOTE and natural neighbors
    Li, Junnan
    Zhu, Qingsheng
    Wu, Quanwang
    Fan, Zhu
    INFORMATION SCIENCES, 2021, 565 : 438 - 455
  • [5] CDSMOTE: class decomposition and synthetic minority class oversampling technique for imbalanced-data classification
    Elyan, Eyad
    Moreno-Garcia, Carlos Francisco
    Jayne, Chrisina
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (07) : 2839 - 2851
  • [6] CDSMOTE: class decomposition and synthetic minority class oversampling technique for imbalanced-data classification
    Elyan, Eyad
    Moreno-Garcia, Carlos Francisco
    Jayne, Chrisina
    Neural Computing and Applications, 2021, 33 (07) : 2839 - 2851
  • [7] CDSMOTE: class decomposition and synthetic minority class oversampling technique for imbalanced-data classification
    Eyad Elyan
    Carlos Francisco Moreno-Garcia
    Chrisina Jayne
    Neural Computing and Applications, 2021, 33 : 2839 - 2851
  • [8] A novel synthetic minority oversampling technique based on relative and absolute densities for imbalanced classification
    Liu, Ruijuan
    APPLIED INTELLIGENCE, 2023, 53 (01) : 786 - 803
  • [9] A novel synthetic minority oversampling technique based on relative and absolute densities for imbalanced classification
    Ruijuan Liu
    Applied Intelligence, 2023, 53 : 786 - 803
  • [10] Class-imbalanced Dynamical Financial Distress Prediction Based on Synthetic Minority Oversampling Technique and Local Weighted Scheme Integrated with Support Vector Machine
    Sun Jie
    Fu Bin-bin
    Li Hui
    Ai Wen-guo
    2018 25TH ANNUAL INTERNATIONAL CONFERENCE ON MANAGEMENT SCIENCE & ENGINEERING, 2018, : 227 - 233