Weighting variables in K-means clustering

Cited by: 13
Authors
Huh, Myung-Hoe [2]
Lim, Yong B. [1]
Affiliations
[1] Ewha Womans Univ, Dept Stat, Seoul, South Korea
[2] Korea Univ, Dept Stat, Seoul, South Korea
Keywords
K-means clustering; variable weighting; penalty constant
DOI
10.1080/02664760802382533
Chinese Library Classification (CLC)
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
The aim of this study is to assign weights w1, ..., wm to m clustering variables Z1, ..., Zm, so that the k groups uncovered reveal more meaningful within-group coherence. We propose a new criterion to be minimized, which is the sum of the weighted within-cluster sums of squares and a penalty for the heterogeneity in the variable weights w1, ..., wm. We present the computing algorithm for such k-means clustering, a working procedure for determining a suitable value of the penalty constant, and three numerical examples, one simulated and two real.
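The shape of such a criterion can be illustrated with a minimal Python sketch. The abstract does not state the exact penalty, so the form used here (squared deviation of each weight from the uniform value 1/m, scaled by the penalty constant) is an assumption made purely for illustration, and the function names `within_cluster_ss` and `penalized_criterion` are hypothetical rather than taken from the paper.

```python
import numpy as np

def within_cluster_ss(Z, labels, k):
    """Per-variable within-cluster sums of squares, returned with shape (m,)."""
    Z = np.asarray(Z, dtype=float)
    labels = np.asarray(labels)
    wss = np.zeros(Z.shape[1])
    for g in range(k):
        members = Z[labels == g]
        if len(members) > 0:
            # Squared deviations from the cluster mean, summed per variable.
            wss += ((members - members.mean(axis=0)) ** 2).sum(axis=0)
    return wss

def penalized_criterion(Z, labels, k, w, penalty_constant):
    """Weighted within-cluster SS plus a penalty on weight heterogeneity.

    ASSUMPTION: the penalty is sum_j (w_j - 1/m)^2; the paper's actual
    penalty may differ.
    """
    w = np.asarray(w, dtype=float)
    m = w.shape[0]
    wss = within_cluster_ss(Z, labels, k)
    heterogeneity = np.sum((w - 1.0 / m) ** 2)
    return float(np.dot(w, wss) + penalty_constant * heterogeneity)

# Toy data: two clusters, two variables.
Z = [[0, 0], [2, 0], [10, 4], [10, 8]]
labels = [0, 0, 1, 1]
uniform = penalized_criterion(Z, labels, 2, [0.5, 0.5], 4.0)
skewed = penalized_criterion(Z, labels, 2, [0.75, 0.25], 4.0)
```

Under this assumed form, uniform weights incur no penalty, while uneven weights trade a smaller weighted sum of squares against a larger penalty; a larger penalty constant pulls the minimizing weights back toward uniformity.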
Pages: 67-78
Page count: 12