Semi-supervised possibilistic c-means clustering algorithm based on feature weights for imbalanced data

Cited by: 10
Authors
Yu, Haiyan [1 ]
Xu, Xiaoyu [1 ]
Li, Honglei [1 ]
Wu, Yuting [1 ]
Lei, Bo [1 ]
Affiliations
[1] Xian Univ Posts & Telecommun, Sch Telecommun & Informat Engn, Xian 710121, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Clustering; Possibilistic c-means clustering (PCM); Semi-supervised; Feature weight; Imbalanced data; Image segmentation; MAHALANOBIS DISTANCE; FUZZY; ENTROPY
DOI
10.1016/j.knosys.2024.111388
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The possibilistic c-means clustering (PCM) algorithm improves the robustness of fuzzy c-means clustering (FCM) to noise and outliers by relaxing the probabilistic constraint on memberships. The semi-supervised possibilistic c-means clustering (SSPCM) algorithm improves clustering on datasets with imbalanced cluster sizes by introducing a small amount of label information. However, traditional semi-supervised algorithms still make poor use of supervision information when the sample sizes of clusters differ greatly. Moreover, the Euclidean distance, which treats all features equally, cannot handle feature-imbalanced data. Therefore, this paper proposes a semi-supervised possibilistic c-means clustering algorithm based on feature weights (FW-SSPCM) by introducing the idea of supervised centers. First, the algorithm introduces the supervised center into the objective function of SSPCM to improve the utilization of supervision information and thus guide the center iteration of small clusters. Second, a feature weighting strategy is introduced into the objective function to adaptively assign feature weights according to the importance of different features in different clusters, improving the adaptability of the algorithm to feature-imbalanced datasets. In addition, to improve robustness to noise and retain more image detail, a new image segmentation algorithm based on FW-SSPCM and local information (LFW-SSPCM) is proposed by introducing local spatial information obtained by bilateral filtering. Finally, clustering experiments on synthetic data, UCI datasets, and multi-feature color images, covering imbalanced sizes, imbalanced features, and strong noise injection, show that the proposed FW-SSPCM and LFW-SSPCM perform significantly better than several related clustering algorithms.
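The two ingredients named in the abstract, possibilistic memberships and per-cluster feature weights, can be illustrated with a minimal sketch. This record does not give the FW-SSPCM objective function, its weight update, or the supervised-center term, so the code below is not the authors' method: it uses the classic PCM typicality update and a generic weighted squared Euclidean distance. The function names, the weight matrix `W`, and the scale parameters `eta` are all assumptions made for illustration.

```python
import numpy as np

def weighted_sq_dist(X, V, W):
    """Feature-weighted squared Euclidean distance (illustrative assumption).

    X: (n, d) samples, V: (c, d) cluster centers,
    W: (c, d) nonnegative feature weights, one row per cluster.
    Returns a (c, n) array of distances from each center to each sample.
    """
    diff = X[None, :, :] - V[:, None, :]          # shape (c, n, d)
    return np.einsum("cnd,cd->cn", diff ** 2, W)  # weight, then sum over features

def pcm_typicality(d2, eta, m=2.0):
    """Classic PCM update: u_ij = 1 / (1 + (d2_ij / eta_i)^(1/(m-1))).

    Columns are not forced to sum to 1, which is the relaxed
    probabilistic constraint mentioned in the abstract.
    """
    return 1.0 / (1.0 + (d2 / eta[:, None]) ** (1.0 / (m - 1.0)))

def update_centers(X, U, m=2.0):
    """Center update as the U^m-weighted mean of the samples."""
    Um = U ** m
    return (Um @ X) / Um.sum(axis=1, keepdims=True)

# Toy data: two tight groups plus one far outlier (hypothetical values).
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [100.0, 100.0]])
V = np.array([[0.0, 0.0], [5.0, 5.0]])
W = np.ones((2, 2))      # equal weights reduce to the plain Euclidean distance
eta = np.ones(2)         # per-cluster scale parameters, assumed fixed here

U = pcm_typicality(weighted_sq_dist(X, V, W), eta)
V_new = update_centers(X, U)
```

In this toy run the outlier receives a near-zero typicality for both clusters, so it barely moves either center; under FCM its memberships would have to sum to 1. Setting a column of `W` to zero makes the distance ignore that feature, which is the mechanism a feature-weighting scheme adjusts per cluster.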
Pages: 37
Related papers
50 in total
  • [1] Applications of semi-supervised subspace possibilistic fuzzy c-means clustering algorithm in IoT
    Zhang, Y. F.
    Zhang, Wei
    INFORMATION TECHNOLOGY AND COMPUTER APPLICATION ENGINEERING, 2014, : 7 - 10
  • [2] Semi-supervised Method with Spatial Weights based Possibilistic Fuzzy c-Means Clustering for Land-cover Classification
    Dinh-Sinh Mai
    Long Thanh Ngo
    PROCEEDINGS OF 2018 5TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS 2018), 2018, : 406 - 411
  • [3] Cutset-type Possibilistic C-means Clustering Algorithms Based on Semi-supervised Information
    Fan Jiulun
    Gao Mengfei
    Yu Haiyan
    Chen Binbin
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (08) : 2378 - 2385
  • [4] Semi-supervised fuzzy c-means clustering of biological data
    Ceccarelli, M
    Maratea, A
    FUZZY LOGIC AND APPLICATIONS, 2006, 3849 : 259 - 266
  • [5] On Semi-Supervised Fuzzy c-Means Clustering
    Yasunori, Endo
    Yukihiro, Hamasuna
    Makito, Yamashiro
    Sadaaki, Miyamoto
    2009 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3, 2009, : 1119 - +
  • [6] Objective Function of Semi-Supervised Fuzzy C-Means Clustering Algorithm
    Li, Chunfang
    Liu, Lianzhong
    Jiang, Wenli
    2008 6TH IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS, VOLS 1-3, 2008, : 704 - +
  • [7] Semi-Supervised Fuzzy C-Means Clustering Algorithm Based on Weighted Euclidean Distance
    Zhang, Peilin
    Xu, Chao
    Fu, Jianping
    Wang, Guode
    Li, Sheng
    2011 INTERNATIONAL CONFERENCE ON FUTURE COMPUTER SCIENCE AND APPLICATION (FCSA 2011), VOL 3, 2011, : 114 - 117
  • [8] General Semi-supervised Possibilistic Fuzzy c-Means clustering for Land-cover Classification
    Dinh Sinh Mai
    Long Thanh Ngo
    PROCEEDINGS OF 2019 11TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE 2019), 2019, : 133 - 138
  • [9] Safe Semi-Supervised Fuzzy C-Means Clustering
    Gan, Haitao
    IEEE ACCESS, 2019, 7 : 95659 - 95664
  • [10] PKFCM - Proximity based Kernel Fuzzy C-Means for Semi-supervised Data Clustering
    Li, Jinbo
    Chen, Long
    PROCEEDINGS 2012 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2012, : 581 - 586