Selection of K in K-means clustering

Cited by: 350
Authors
Pham, DT [1]
Dimov, SS [1]
Nguyen, CD [1]
Affiliation
[1] Cardiff Univ, Mfg Engn Ctr, Cardiff CF24 0YF, Wales
Keywords
clustering; K-means algorithm; cluster number selection
DOI
10.1243/095440605X8298
Chinese Library Classification (CLC) number
TH [Machinery and Instrument Industry]
Discipline classification code
0802
Abstract
The K-means algorithm is a popular data-clustering algorithm. However, one of its drawbacks is the requirement for the number of clusters, K, to be specified before the algorithm is applied. This paper first reviews existing methods for selecting the number of clusters for the algorithm. Factors that affect this selection are then discussed and a new measure to assist the selection is proposed. The paper concludes with an analysis of the results of using the proposed measure to determine the number of clusters for the K-means algorithm for different data sets.
Pages: 103-119
Number of pages: 17
Related papers
  • [1] Stability and model selection in k-means clustering
    Ohad Shamir
    Naftali Tishby
    Machine Learning, 2010, 80 : 213 - 243
  • [2] Deterministic Feature Selection for k-Means Clustering
    Boutsidis, Christos
    Magdon-Ismail, Malik
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2013, 59 (09) : 6099 - 6110
  • [3] A Variable Selection Procedure for K-Means Clustering
    Kim, Sung-Soo
    KOREAN JOURNAL OF APPLIED STATISTICS, 2012, 25 (03) : 471 - 483
  • [4] Stability and model selection in k-means clustering
    Shamir, Ohad
    Tishby, Naftali
    MACHINE LEARNING, 2010, 80 (2-3) : 213 - 243
  • [5] Research on k-means Clustering Algorithm An Improved k-means Clustering Algorithm
    Shi Na
    Liu Xumin
    Guan Yong
    2010 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY AND SECURITY INFORMATICS (IITSI 2010), 2010, : 63 - 67
  • [6] K-Means Cloning: Adaptive Spherical K-Means Clustering
    Hedar, Abdel-Rahman
    Ibrahim, Abdel-Monem M.
    Abdel-Hakim, Alaa E.
    Sewisy, Adel A.
    ALGORITHMS, 2018, 11 (10):
  • [7] K-means Clustering with Feature Selection for Stream Data
    Wang, Xiao-dong
    Chen, Rung-Ching
    Yan, Fei
    Hendry
    2018 INTERNATIONAL SYMPOSIUM ON COMPUTER, CONSUMER AND CONTROL (IS3C 2018), 2018, : 453 - 456
  • [8] Feature Selection Algorithm Based on K-means Clustering
    Tang, Xue
    Dong, Min
    Bi, Sheng
    Pei, Maofeng
    Cao, Dan
    Xie, Cheche
    Chi, Sunhuang
    2017 IEEE 7TH ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (CYBER), 2017, : 1522 - 1527
  • [9] A variable-selection heuristic for K-means clustering
    Michael J. Brusco
    J. Dennis Cradit
    Psychometrika, 2001, 66 : 249 - 270
  • [10] A variable-selection heuristic for K-means clustering
    Brusco, MJ
    Cradit, JD
    PSYCHOMETRIKA, 2001, 66 (02) : 249 - 270