Efficient k-Means on GPUs

被引:6
|
作者
Lutz, Clemens [1 ]
Bress, Sebastian [1 ]
Rabl, Tilmann [2 ]
Zeuch, Steffen [1 ]
Markl, Volker [2 ]
机构
[1] DFKI GmbH, Kaiserslautern, Germany
[2] TU Berlin, Berlin, Germany
关键词
D O I
10.1145/3211922.3211925
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
k-Means is a versatile clustering algorithm widely-used in practice. To cluster large data sets, state-of-the-art implementations use GPUs to shorten the data to knowledge time. These implementations commonly assign points on a GPU and update centroids on a CPU. We show that this approach has two main drawbacks. First, it separates the two algorithm phases over different processors, which requires an expensive data exchange between devices. Second, even when both phases are computed on the GPU, the same data are read twice per iteration, leading to inefficient use of memory bandwidth. In this paper, we describe a new approach that executes k-means in a single data pass per iteration. We propose a new algorithm to updates centroids that allows us to perform both phases efficiently on GPUs. Thereby, we remove data transfers within each iteration. We fuse both phases to eliminate artificial synchronization barriers, and thus compute k-means in a single data pass. Overall, we achieve up to 20x higher throughput compared to the state-of-the-art approach.
引用
收藏
页数:3
相关论文
共 50 条
  • [31] An efficient approximation to the K-means clustering for massive data
    Capo, Marco
    Perez, Aritz
    Lozano, Jose A.
    KNOWLEDGE-BASED SYSTEMS, 2017, 117 : 56 - 69
  • [32] Memory and Communication Efficient Federated Kernel k-Means
    Zhou, Xiaochen
    Wang, Xudong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 7114 - 7125
  • [33] K-Means Cloning: Adaptive Spherical K-Means Clustering
    Hedar, Abdel-Rahman
    Ibrahim, Abdel-Monem M.
    Abdel-Hakim, Alaa E.
    Sewisy, Adel A.
    ALGORITHMS, 2018, 11 (10):
  • [34] Empirical Evaluation of K-Means, Bisecting K-Means, Fuzzy C-Means and Genetic K-Means Clustering Algorithms
    Banerjee, Shreya
    Choudhary, Ankit
    Pal, Somnath
    2015 IEEE INTERNATIONAL WIE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (WIECON-ECE), 2015, : 172 - 176
  • [35] Deep k-Means: Jointly clustering with k-Means and learning representations
    Fard, Maziar Moradi
    Thonet, Thibaut
    Gaussier, Eric
    PATTERN RECOGNITION LETTERS, 2020, 138 : 185 - 192
  • [36] Clustering of Image Data Using K-Means and Fuzzy K-Means
    Rahmani, Md. Khalid Imam
    Pal, Naina
    Arora, Kamiya
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (07) : 160 - 163
  • [37] K and starting means for k-means algorithm
    Fahim, Ahmed
    JOURNAL OF COMPUTATIONAL SCIENCE, 2021, 55
  • [38] Learning the k in k-means
    Hamerly, G
    Elkan, C
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 281 - 288
  • [39] GK-means: An Efficient K-means Clustering Algorithm Based On Grid
    Chen, Xiaoyun
    Su, Youli
    Chen, Yi
    Liu, Guohua
    2009 INTERNATIONAL SYMPOSIUM ON COMPUTER NETWORK AND MULTIMEDIA TECHNOLOGY (CNMT 2009), VOLUMES 1 AND 2, 2009, : 531 - 534
  • [40] PSO Aided k-Means Clustering: Introducing Connectivity in k-Means
    Breaban, Mihaela Elena
    Luchian, Henri
    GECCO-2011: PROCEEDINGS OF THE 13TH ANNUAL GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2011, : 1227 - 1234