Efficient k-Means on GPUs

Cited by: 6
Authors
Lutz, Clemens [1]
Bress, Sebastian [1]
Rabl, Tilmann [2]
Zeuch, Steffen [1]
Markl, Volker [2]
Affiliations
[1] DFKI GmbH, Kaiserslautern, Germany
[2] TU Berlin, Berlin, Germany
DOI
10.1145/3211922.3211925
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
k-Means is a versatile clustering algorithm that is widely used in practice. To cluster large data sets, state-of-the-art implementations use GPUs to shorten the data-to-knowledge time. These implementations commonly assign points on a GPU and update centroids on a CPU. We show that this approach has two main drawbacks. First, it separates the two algorithm phases over different processors, which requires an expensive data exchange between devices. Second, even when both phases are computed on the GPU, the same data are read twice per iteration, leading to inefficient use of memory bandwidth. In this paper, we describe a new approach that executes k-means in a single data pass per iteration. We propose a new algorithm to update centroids that allows us to perform both phases efficiently on GPUs. Thereby, we remove data transfers within each iteration. We fuse both phases to eliminate artificial synchronization barriers, and thus compute k-means in a single data pass. Overall, we achieve up to 20x higher throughput compared to the state-of-the-art approach.
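To make the single-pass idea concrete, the following is a minimal sequential sketch in plain Python. It is not the authors' GPU kernel: the function name `fused_kmeans_iteration`, the per-cluster `sums` and `counts` accumulators, and the handling of empty clusters are assumptions chosen for illustration. It only shows how the assignment and update phases can share one read of each point instead of two.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def fused_kmeans_iteration(points, centroids):
    """Run one k-means iteration with a single pass over `points`.

    Hypothetical illustration: each point is read once, assigned to its
    nearest centroid, and immediately added to that cluster's running sum.
    """
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]  # per-cluster coordinate sums
    counts = [0] * k                        # per-cluster point counts

    for p in points:
        # Assignment phase: index of the closest current centroid.
        nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
        # Update phase, fused into the same pass: accumulate the point.
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]

    # Finalize: new centroid = mean of assigned points.
    # Assumption: an empty cluster keeps its previous centroid.
    new_centroids = []
    for c in range(k):
        if counts[c] == 0:
            new_centroids.append(list(centroids[c]))
        else:
            new_centroids.append([s / counts[c] for s in sums[c]])
    return new_centroids


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = [(1.0, 1.0), (9.0, 9.0)]
print(fused_kmeans_iteration(points, centroids))
# prints [[0.0, 1.0], [10.0, 11.0]]
```

On a GPU, the accumulation step would presumably need some form of parallel reduction or atomic addition so that many threads can add into the same cluster sums; the abstract does not specify the mechanism, so this sketch leaves it out.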
Pages: 3