Data decomposition for parallel K-means clustering

Cited by: 0
Authors
Gursoy, A [1]
Affiliation
[1] Koc Univ, Dept Comp Engn, TR-34450 Istanbul, Turkey
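Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract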
Developing fast algorithms for clustering has been an important area of research in data mining and other fields. K-means is one of the most widely used clustering algorithms. In this work, we have developed and evaluated a parallelization of the k-means method for low-dimensional data on message-passing computers. Three different data decomposition schemes and their impact on the pruning of distance calculations in the tree-based k-means algorithm have been studied. Random pattern decomposition achieves good load balance but fails to prune distance calculations effectively. Compact spatial decomposition of patterns based on space-filling curves outperforms random pattern decomposition even though it suffers from load imbalance. In both cases, parallel tree-based k-means clustering runs significantly faster than traditional parallel k-means.
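To make the two decompositions named in the abstract concrete, here is a minimal sketch for 2-D patterns. It is not the paper's implementation: the abstract does not say which space-filling curve is used, so a Z-order (Morton) curve is assumed, and all function names (`morton_key`, `split_blocks`, `random_decomposition`, `spatial_decomposition`) are hypothetical. Splitting into equal-count blocks is also an assumption, and the sketch does not model the load imbalance the abstract reports.

```python
import random


def morton_key(x, y, bits=16):
    """Interleave the bits of two non-negative integers (Z-order key)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key


def split_blocks(items, n_procs):
    """Cut a list into n_procs contiguous blocks of near-equal length."""
    size, extra = divmod(len(items), n_procs)
    blocks, start = [], 0
    for rank in range(n_procs):
        end = start + size + (1 if rank < extra else 0)
        blocks.append(items[start:end])
        start = end
    return blocks


def random_decomposition(points, n_procs, seed=0):
    """Assumed random scheme: shuffle the patterns, then deal out blocks."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    return split_blocks(shuffled, n_procs)


def _quantize(v, lo, hi, bits):
    """Map v in [lo, hi] onto an integer grid with 2**bits cells."""
    if hi == lo:
        return 0
    cell = int((v - lo) / (hi - lo) * (1 << bits))
    return min(cell, (1 << bits) - 1)


def spatial_decomposition(points, n_procs, bits=16):
    """Assumed spatial scheme: sort patterns along a Z-order curve, then
    cut the sorted sequence into contiguous blocks, so that each block
    covers a compact region of the plane."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)

    def key(p):
        return morton_key(_quantize(p[0], x_lo, x_hi, bits),
                          _quantize(p[1], y_lo, y_hi, bits))

    return split_blocks(sorted(points, key=key), n_procs)


if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    for rank, block in enumerate(spatial_decomposition(grid, 4)):
        print(rank, block)
```

In a message-passing setting, each block would be sent to one process. Spatially compact blocks are the kind of input for which a tree-based k-means can plausibly discard distant centroids for whole groups of patterns at once, which matches the pruning effect the abstract describes.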
Pages: 241-248
Page count: 8