Selection of K in K-means clustering

Cited by: 350
Authors
Pham, DT [1]
Dimov, SS [1]
Nguyen, CD [1]
Affiliations
[1] Cardiff Univ, Mfg Engn Ctr, Cardiff CF24 OYF, Wales
Keywords
clustering; K-means algorithm; cluster number selection
DOI
10.1243/095440605X8298
Chinese Library Classification (CLC) number
TH [Machinery and instrument industry]
Discipline classification code
0802
Abstract
The K-means algorithm is a popular data-clustering algorithm. However, one of its drawbacks is the requirement for the number of clusters, K, to be specified before the algorithm is applied. This paper first reviews existing methods for selecting the number of clusters for the algorithm. Factors that affect this selection are then discussed and a new measure to assist the selection is proposed. The paper concludes with an analysis of the results of using the proposed measure to determine the number of clusters for the K-means algorithm for different data sets.
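To make the dependence on K concrete, here is a minimal sketch that runs a plain Lloyd-style K-means for several values of K and records the distortion (the sum of squared distances from each point to its nearest centroid). It is not the measure proposed in the paper, which the abstract does not define. The names `kmeans`, `distortion_by_k` and `sq_dist`, the toy data, and the fixed random seed are all assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style K-means (illustrative). Returns (centroids, distortion)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k data points drawn at random.
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    distortion = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, distortion


def distortion_by_k(points, k_values, seed=0):
    """Run K-means once per K and collect the resulting distortion."""
    return {k: kmeans(points, k, seed=seed)[1] for k in k_values}


# Toy data: three small, well-separated groups of four points each.
points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0, 0), (10, 0), (0, 10)]
    for dx, dy in [(0, 0), (1, 0), (0, 1), (1, 1)]
]
distortions = distortion_by_k(points, [1, 2, 3, 4])
for k, d in sorted(distortions.items()):
    print(k, round(d, 3))
```

Distortion alone cannot select K, since it tends to keep falling as K grows and reaches zero when every point is its own cluster. This is one reason a dedicated selection measure, such as the one the paper proposes, is useful.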
Pages: 103-119
Page count: 17