Joint Sample Position Based Noise Filtering and Mean Shift Clustering for Imbalanced Classification Learning

Cited by: 1
Authors
Duan, Lilong [1, 2]
Xue, Wei [1, 2]
Huang, Jun [1, 2]
Zheng, Xiao [1, 2]
Affiliations
[1] Anhui Univ Technol, Sch Comp Sci & Technol, Maanshan 243032, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei 230088, Peoples R China
Source
TSINGHUA SCIENCE AND TECHNOLOGY | 2024 / Vol. 29 / Issue 01
Keywords
Clustering algorithms; Filtering algorithms; Benchmark testing; Sampling methods; Information filters; Cleaning; Classification algorithms; imbalanced data classification; oversampling; noise filtering; clustering; OVERSAMPLING TECHNIQUE; SMOTE; PREDICTION;
DOI
10.26599/TST.2023.9010006
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The problem of imbalanced data classification learning has received much attention. Conventional classification algorithms are susceptible to data skew, favoring majority samples and ignoring minority samples. The majority weighted minority oversampling technique (MWMOTE) is an effective approach to this problem; however, it may suffer from inadequate noise filtering and from synthesizing samples identical to the original minority data. To this end, we propose an improved MWMOTE method, named joint sample position based noise filtering and mean shift clustering (SPMSC), to solve these problems. First, to effectively eliminate the effect of noisy samples, SPMSC uses a new noise filtering mechanism that determines whether a minority sample is noisy based on its position and distribution relative to the majority samples. Second, because MWMOTE may generate duplicate samples, we employ the mean shift algorithm to cluster minority samples and thereby reduce duplicate synthetic samples. Finally, data cleaning is performed on the processed data to further eliminate class overlap. Experiments on extensive benchmark datasets demonstrate the effectiveness of SPMSC compared with other sampling methods.
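The pipeline described in the abstract (filter noisy minority samples, cluster the rest with mean shift, then synthesize within clusters) can be illustrated with a small sketch. This is not the authors' SPMSC implementation. The abstract does not specify the position-based filter, so a generic k-nearest-neighbour check stands in for it. The function names `filter_noise`, `mean_shift`, and `oversample`, the parameters `k` and `bandwidth`, and the toy data are all assumptions made for illustration. The final data cleaning step is omitted.

```python
import math
import random


def filter_noise(minority, majority, k=3):
    """Stand-in noise filter (assumption, not the paper's rule): drop a
    minority point when all of its k nearest neighbours are majority points."""
    kept = []
    for i, p in enumerate(minority):
        neighbours = [(math.dist(p, q), 0) for j, q in enumerate(minority) if j != i]
        neighbours += [(math.dist(p, q), 1) for q in majority]
        nearest = sorted(neighbours)[:k]
        if any(label == 0 for _, label in nearest):
            kept.append(p)
    return kept


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift; returns one cluster label per input point."""
    modes = []
    for p in points:
        mode = tuple(p)
        for _ in range(max_iter):
            # Move to the average of all points within `bandwidth`.
            near = [q for q in points if math.dist(mode, q) <= bandwidth]
            if not near:
                break
            new_mode = tuple(sum(c) / len(near) for c in zip(*near))
            moved = math.dist(mode, new_mode)
            mode = new_mode
            if moved < tol:
                break
        modes.append(mode)
    # Points whose modes end up close together share a cluster label.
    centers, labels = [], []
    for mode in modes:
        for idx, center in enumerate(centers):
            if math.dist(mode, center) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return labels


def oversample(minority, labels, n_new, rng):
    """Interpolate between two distinct members of the same cluster."""
    clusters = {}
    for p, label in zip(minority, labels):
        clusters.setdefault(label, []).append(p)
    # Singleton clusters are skipped: interpolating a point with itself
    # would only reproduce the original sample.
    usable = [c for c in clusters.values() if len(c) >= 2]
    synthetic = []
    while usable and len(synthetic) < n_new:
        a, b = rng.sample(rng.choice(usable), 2)
        t = rng.uniform(0.1, 0.9)  # stay strictly between the two endpoints
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


# Toy data (hypothetical): two minority groups plus one minority point
# sitting inside a majority region.
minority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 5.1), (9.0, 0.0)]
majority = [(8.8, 0.1), (9.1, -0.2), (9.2, 0.2), (8.9, -0.1), (3.0, 8.0), (7.0, 7.0)]

kept = filter_noise(minority, majority, k=3)
labels = mean_shift(kept, bandwidth=1.0)
synthetic = oversample(kept, labels, n_new=4, rng=random.Random(0))
print(len(kept), labels, len(synthetic))
```

In this toy run the point at (9.0, 0.0) is treated as noise because its three nearest neighbours are all majority samples, and every synthetic sample is interpolated between two different points of one cluster. A practical version would likely use a library implementation of mean shift and a data-driven bandwidth rather than the fixed value assumed here.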
Pages: 216-231
Number of pages: 16