A variable-selection heuristic for K-means clustering

Cited by: 98
Authors
Brusco, MJ [1 ]
Cradit, JD [1 ]
Affiliation
[1] Florida State Univ, Coll Business, Dept Mkt, Tallahassee, FL 32306 USA
Keywords
cluster analysis; K-means partitioning; variable selection; heuristics
DOI
10.1007/BF02294838
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
One of the most vexing problems in cluster analysis is the selection and/or weighting of variables in order to include those that truly define cluster structure, while eliminating those that might mask such structure. This paper presents a variable-selection heuristic for nonhierarchical (K-means) cluster analysis based on the adjusted Rand index for measuring cluster recovery. The heuristic was subjected to Monte Carlo testing across more than 2200 datasets with known cluster structure. The results indicate the heuristic is extremely effective at eliminating masking variables. A cluster analysis of real-world financial services data revealed that using the variable-selection heuristic prior to the K-means algorithm resulted in greater cluster stability.
Pages: 249-270
Page count: 22