An Efficient Dimension Reduction Technique for Basic K-Means Clustering Algorithm

Cited by: 0
Authors
Usman, Dauda [1 ]
Mohamad, Ismail [1 ]
Affiliation
[1] Univ Teknol Malaysia, Fac Sci, Dept Math Sci, Johor Baharu 81310, Johor Darul Taa, Malaysia
Keywords
Decimal Scaling; K-Means Clustering; Min-Max; Principal Component Analysis; Standardization; z-score;
DOI
Not available
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
K-means clustering is a widely studied problem in a variety of application domains. The computational complexity of the basic k-means algorithm is very high, and the number of distance calculations increases with the dimensionality of the data. Several algorithms have been proposed to improve the performance of basic k-means. Here we investigate the behavior of the basic k-means clustering algorithm and two alternatives to it, and we analyze the performance of three different standardization methods. We prove that z-score standardization and principal component analysis are the best preprocessing methods for simplifying the analysis and visualizing the multidimensional dataset. The results reveal that z-score outperforms min-max and decimal scaling, and that principal component analysis picks up the dimensions with the largest variances. Our results also provide effective ways to solve k-means clustering problems.
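As an illustration only, the three standardization methods named in the abstract can be sketched in plain Python as follows. This is a minimal sketch using the common textbook definitions, not the authors' implementation: the function names are hypothetical, the z-score uses the population standard deviation (an assumption, since the abstract does not say which variant is used), and each feature is assumed to be non-constant.

```python
import statistics


def z_score(values):
    """Center on the mean and divide by the standard deviation.

    Assumption: population standard deviation; the feature is not constant.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values onto the interval [new_min, new_max]."""
    lo, hi = min(values), max(values)
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest non-negative integer
    such that every scaled absolute value is below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


# One feature column, scaled three ways before it would be passed to k-means.
feature = [10.0, 20.0, 30.0]
print(z_score(feature))
print(min_max(feature))
print(decimal_scaling(feature))
```

Each function operates on a single feature column; in a multidimensional dataset the same scaling would be applied column by column before clustering.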
Pages: 253-267
Page count: 15
Related papers
50 in total
  • [41] A hybrid clustering technique combining a novel genetic algorithm with K-Means
    Rahman, Md Anisur
    Islam, Md Zahidul
    KNOWLEDGE-BASED SYSTEMS, 2014, 71 : 345 - 365
  • [42] K-Means Clustering Efficient Algorithm with Initial Class Center Selection
    Huang Suyu
    Hu Pingfang
    PROCEEDINGS OF THE 2018 3RD INTERNATIONAL WORKSHOP ON MATERIALS ENGINEERING AND COMPUTER SCIENCES (IWMECS 2018), 2018, 78 : 301 - 305
  • [43] A comparative study of efficient initialization methods for the k-means clustering algorithm
    Celebi, M. Emre
    Kingravi, Hassan A.
    Vela, Patricio A.
    EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (01) : 200 - 210
  • [44] An efficient greedy K-means algorithm for global gene trajectory clustering
    Chan, ZSH
    Collins, L
    Kasabov, N
    EXPERT SYSTEMS WITH APPLICATIONS, 2006, 30 (01) : 137 - 141
  • [45] An Efficient Data Structure for Document Clustering Using K-Means Algorithm
    Killani, Ramanji
    Satapathy, Suresh Chandra
    Sowjanya, A. M.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS 2012 (INDIA 2012), 2012, 132 : 337 - +
  • [46] eXploratory K-Means: A new simple and efficient algorithm for gene clustering
    Lam, Yau King
    Tsang, Peter W. M.
    APPLIED SOFT COMPUTING, 2012, 12 (03) : 1149 - 1157
  • [47] Soil data clustering by using K-means and fuzzy K-means algorithm
    Hot, Elma
    Popovic-Bugarin, Vesna
    2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015, : 890 - 893
  • [48] IMPROVEMENT IN K-MEANS CLUSTERING ALGORITHM FOR DATA CLUSTERING
    Rajeswari, K.
    Acharya, Omkar
    Sharma, Mayur
    Kopnar, Mahesh
    Karandikar, Kiran
    1ST INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION ICCUBEA 2015, 2015, : 367 - 369
  • [49] On K-means Data Clustering Algorithm with Genetic Algorithm
    Kapil, Shruti
    Chawla, Meenu
    Ansari, Mohd Dilshad
    2016 FOURTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2016, : 202 - 206
  • [50] Randomized Dimensionality Reduction for k-Means Clustering
    Boutsidis, Christos
    Zouzias, Anastasios
    Mahoney, Michael W.
    Drineas, Petros
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2015, 61 (02) : 1045 - 1062