Semi-supervised possibilistic c-means clustering algorithm based on feature weights for imbalanced data

Cited by: 10
Authors
Yu, Haiyan [1 ]
Xu, Xiaoyu [1 ]
Li, Honglei [1 ]
Wu, Yuting [1 ]
Lei, Bo [1 ]
Affiliations
[1] Xian Univ Posts & Telecommun, Sch Telecommun & Informat Engn, Xian 710121, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Clustering; Possibilistic c-means clustering (PCM); Semi-supervised; Feature weight; Imbalanced data; Image segmentation; MAHALANOBIS DISTANCE; FUZZY; ENTROPY
DOI
10.1016/j.knosys.2024.111388
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The possibilistic c-means clustering (PCM) algorithm improves the robustness of fuzzy c-means clustering (FCM) to noise and outliers by relaxing the probabilistic constraint on memberships. The semi-supervised possibilistic c-means clustering (SSPCM) algorithm improves clustering on datasets with imbalanced sizes by introducing a small amount of label information. However, the traditional semi-supervised algorithm still makes poor use of supervision information on datasets with large differences in sample sizes. Moreover, the Euclidean distance, which treats all features equally, cannot handle feature-imbalanced data. This paper therefore proposes a semi-supervised possibilistic c-means clustering algorithm based on feature weights (FW-SSPCM) that introduces the idea of supervised centers. First, the algorithm introduces the supervised center into the objective function of SSPCM to improve the utilization of supervision information and thus guide the center iteration of small clusters. Second, a feature weighting strategy is introduced into the objective function to adaptively assign feature weights according to the importance of different features in different clusters, improving the adaptability of the algorithm to feature-imbalanced datasets. In addition, to improve robustness to noise and retain more image detail, a new image segmentation algorithm based on FW-SSPCM and local information (LFW-SSPCM) is proposed, which introduces local spatial information obtained by bilateral filtering. Finally, clustering experiments on synthetic data, UCI datasets, and color images with multiple features, covering imbalanced sizes, imbalanced features, and strong noise injection, show that the proposed FW-SSPCM and LFW-SSPCM perform significantly better than several related clustering algorithms.
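To make the two ideas in the abstract concrete, the following is a minimal sketch of a classical possibilistic membership update combined with a per-cluster feature-weighted squared distance. It is not the FW-SSPCM objective from the paper: the supervised-center term and the weight update rule are not given in this record and are omitted. The function names, the weight matrix `W`, and the scale parameters `eta` are assumptions introduced only for illustration.

```python
import numpy as np

def weighted_sq_dist(X, centers, W):
    """Squared distance from each sample to each center, with
    per-cluster feature weights (W is a hypothetical input here).
    X: (n, d), centers: (c, d), W: (c, d) -> returns (c, n)."""
    diff = X[None, :, :] - centers[:, None, :]      # (c, n, d)
    return np.einsum("cnd,cd->cn", diff ** 2, W)    # weighted sum over features

def pcm_memberships(d2, eta, m=2.0):
    """Classical PCM typicality: u = 1 / (1 + (d^2 / eta)^(1/(m-1))).
    Rows are clusters; columns need NOT sum to 1, which is the
    relaxed probabilistic constraint mentioned in the abstract."""
    return 1.0 / (1.0 + (d2 / eta[:, None]) ** (1.0 / (m - 1.0)))

def update_centers(X, U, m=2.0):
    """Standard membership-weighted mean for each cluster center."""
    Um = U ** m                                     # (c, n)
    return (Um @ X) / Um.sum(axis=1, keepdims=True)

# Tiny illustration with assumed values (uniform weights, eta = 1).
X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
W = np.ones((2, 2))
eta = np.array([1.0, 1.0])

U = pcm_memberships(weighted_sq_dist(X, centers, W), eta)
new_centers = update_centers(X, U)
```

With uniform weights the distance reduces to the squared Euclidean distance; giving a feature a larger weight in one cluster makes that cluster more sensitive to deviations along that feature, which is the intuition behind handling feature-imbalanced data.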
Pages: 37
Related papers
50 in total
  • [31] Effects of Semi-supervised Learning on Rough Set-Based C-Means Clustering
    Ubukata, Seiki
    Shimizu, Takeaki
    Notsu, Akira
    Honda, Katsuhiro
    2018 INTERNATIONAL CONFERENCE ON FUZZY THEORY AND ITS APPLICATIONS (IFUZZY), 2018, : 12 - 17
  • [32] Medical Image Segmentation Using Seeded Fuzzy C-means: A Semi-supervised Clustering Algorithm
    Santos, Luis
    Veras, Rodrigo
    Aires, Kelson
    Britto, Laurindo
    Machado, Vinicius
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [33] An Adaptive and Semi-Supervised Fuzzy C-means Clustering Algorithm for Remotely Sensed Change Detection
    Shao P.
    Fan H.
    Gao Z.
    Journal of Geo-Information Science, 2022, 24 (03) : 508 - 521
  • [34] A Comparison of Distance-based Semi-Supervised Fuzzy c-Means Clustering Algorithms
    Lai, Daphne Teck Ching
    Garibaldi, Jonathan M.
    IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011), 2011, : 1580 - 1586
  • [35] Suppressed possibilistic fuzzy c-means clustering based on shadow sets for noisy data with imbalanced sizes
    Yu, Haiyan
    Li, Honglei
    Xu, Xiaoyu
    Gao, Qian
    Lan, Rong
    APPLIED SOFT COMPUTING, 2024, 167
  • [36] A gradient ascent algorithm based on possibilistic fuzzy C-Means for clustering noisy data
    Saberi, Hossein
    Sharbati, Reza
    Farzanegan, Behzad
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 191
  • [37] Research on Oil Atomic Spectrometric Data Semi-Supervised Fuzzy C-Means Clustering Based on Parzen Window
    Xu Chao
    Zhang Pei-lin
    Ren Guo-quan
    Wu Ding-hai
    SPECTROSCOPY AND SPECTRAL ANALYSIS, 2010, 30 (08) : 2175 - 2178
  • [38] A SEMI-SUPERVISED LEARNING ALGORITHM BASED ON SVM FOR IMBALANCED DATA
    Du, Limin
    Xu, Yang
    He, Xingxing
    UNCERTAINTY MODELLING IN KNOWLEDGE ENGINEERING AND DECISION MAKING, 2016, 10 : 194 - 200
  • [39] Semi-supervised kernel-based fuzzy c-means
    Zhang, DQ
    Tan, KR
    Chen, SC
    NEURAL INFORMATION PROCESSING, 2004, 3316 : 1229 - 1234
  • [40] Text Categorization using the Semi-Supervised Fuzzy c-Means Algorithm
    Benkhalifa, M
    Bensaid, A
    18TH INTERNATIONAL CONFERENCE OF THE NORTH AMERICAN FUZZY INFORMATION PROCESSING SOCIETY - NAFIPS, 1999, : 561 - 565