Variable Weighting in Fuzzy k-Means Clustering to Determine the Number of Clusters

Cited by: 38
Authors
Khan, Imran [1 ]
Luo, Zongwei [1 ]
Huang, Joshua Zhexue [2 ]
Shahzad, Waseem [3 ]
Affiliations
[1] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen Key Lab Computat Intelligence, Shenzhen 518055, Guangdong, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[3] Natl Univ Comp & Emerging Sci, Dept Comp Sci, Islamabad 44000, Pakistan
Keywords
Fuzzy k-means; clustering; number of clusters; data mining; variable weighting; MEANS ALGORITHM; DATA SETS; SELECTION; CENTERS; MODEL;
DOI
10.1109/TKDE.2019.2911582
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One of the most significant problems in cluster analysis is determining the number of clusters in unlabeled data, which most clustering algorithms require as input. Some methods have been developed to address this problem. However, little attention has been paid to algorithms that are insensitive to the initialization of cluster centers and that use variable weights to recover the number of clusters. To fill this gap, we extend the standard fuzzy k-means clustering algorithm. The extended algorithm can automatically determine the number of clusters by iteratively calculating the weights of all variables and the membership value of each object in all clusters. Two new steps are added to the fuzzy k-means clustering process. The first introduces a penalty term that makes the clustering process insensitive to the initial cluster centers. The second uses a formula to iteratively update the variable weights in each cluster based on the current partition of the data. Experimental results on real-world and synthetic datasets show that the proposed algorithm effectively determined the correct number of clusters when initialized with different numbers of cluster centroids. We also tested the proposed algorithm on gene data to determine a subset of important genes.
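For orientation only, the baseline that the abstract extends can be sketched as follows. This is the textbook fuzzy k-means iteration (membership update, then center update) with a fixed per-variable weight vector in the distance. It is not the proposed algorithm: the abstract does not give the penalty term or the weight-update formula, so neither is reproduced here. The function name `weighted_fuzzy_kmeans`, its parameters, and the uniform default weights are assumptions made for illustration.

```python
import numpy as np

def weighted_fuzzy_kmeans(X, k, m=2.0, weights=None, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy k-means with FIXED variable weights (illustrative sketch).

    X: (n, d) data matrix; k: number of clusters; m > 1: fuzzifier.
    weights: optional (d,) vector of variable weights (assumed, not learned).
    Returns the (n, k) membership matrix and the (k, d) cluster centers.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.ones(d) / d if weights is None else np.asarray(weights, dtype=float)
    # Start from k distinct objects chosen at random.
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Weighted squared distances between objects and centers, shape (n, k).
        diff = X[:, None, :] - centers[None, :, :]
        d2 = np.einsum("nkd,d->nk", diff ** 2, w)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # Membership update: u_ij is proportional to d2_ij ** (-1 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the objects weighted by u_ij ** m.
        Um = U ** m
        new_centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return U, centers
```

In this sketch the number of clusters `k` and the weights stay fixed throughout; according to the abstract, the proposed method instead updates the variable weights in each cluster at every iteration and recovers the number of clusters automatically.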
Pages: 1838-1853
Page count: 16