Efficient and Reliable Clustering by Parallel Random Swap Algorithm

Cited by: 2
Authors
Nigro, Libero [1]
Cicirelli, Franco [2]
Franti, Pasi [3]
Affiliations
[1] Univ Calabria, DIMES Dept Informat Modelling Elect & Syst Sci, I-87036 Arcavacata Di Rende, Italy
[2] Natl Res Council Italy, CNR, Inst High Performance Comp & Networking ICAR, I-87036 Arcavacata Di Rende, Italy
[3] Univ Eastern Finland, Sch Comp, Machine Learning Grp, POB 111, Joensuu 80101, Finland
Keywords
Clustering problem; K-Means; Random swap; Parallelism; Streams; Lambda Expressions; Java
DOI
10.1109/DS-RT55542.2022.9932090
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Solving large-scale clustering problems requires an efficient algorithm that can also be implemented in parallel. K-means would be suitable, but it can produce an inaccurate clustering result. To overcome this problem, we present a parallel version of the random swap clustering algorithm. It combines the scalability of k-means with high clustering accuracy. The new clustering method is implemented and evaluated on top of Java parallel streams and lambda expressions, which offer execution-time benefits. The method is applied to standard benchmark datasets that vary in population size, distribution of records, dimensionality of data points, and number of clusters. The experimental results confirm that parallel random swap achieves high-quality clustering together with high time efficiency.
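The abstract names the technique but this record does not include the paper's code, so the following is only a minimal sketch of how random swap clustering could be laid over Java parallel streams. It is not the authors' implementation. The class name `RandomSwapSketch`, all method names, and the parameters `swaps` and `kmeansIters` are hypothetical. The sketch assumes that only the per-point assignment and error computation are parallelized, that each trial swap replaces one randomly chosen centroid with a randomly chosen data point, and that a trial is kept only if the sum of squared errors (SSE) decreases.

```java
import java.util.Random;
import java.util.stream.IntStream;

final class RandomSwapSketch {
    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0.0;
        for (int d = 0; d < a.length; d++) { double diff = a[d] - b[d]; s += diff * diff; }
        return s;
    }

    // Index of the centroid nearest to point p.
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int j = 0; j < centroids.length; j++) {
            double d = dist2(p, centroids[j]);
            if (d < bestDist) { bestDist = d; best = j; }
        }
        return best;
    }

    // Assignment step: points are independent, so a parallel stream with a lambda applies.
    static int[] assign(double[][] data, double[][] centroids) {
        return IntStream.range(0, data.length).parallel()
                .map(i -> nearest(data[i], centroids)).toArray();
    }

    // Sum of squared errors of the partition, also computed in parallel.
    static double sse(double[][] data, double[][] centroids, int[] labels) {
        return IntStream.range(0, data.length).parallel()
                .mapToDouble(i -> dist2(data[i], centroids[labels[i]])).sum();
    }

    // Update step (sequential here for brevity); an empty cluster keeps its old centroid.
    static double[][] update(double[][] data, int[] labels, double[][] old) {
        int k = old.length, dim = data[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];
        for (int i = 0; i < data.length; i++) {
            counts[labels[i]]++;
            for (int d = 0; d < dim; d++) sums[labels[i]][d] += data[i][d];
        }
        for (int j = 0; j < k; j++) {
            if (counts[j] == 0) { sums[j] = old[j].clone(); continue; }
            for (int d = 0; d < dim; d++) sums[j][d] /= counts[j];
        }
        return sums;
    }

    // Random swap: try a swap, fine-tune with a few k-means iterations, keep it if SSE improves.
    static double[][] randomSwap(double[][] data, int k, int swaps, int kmeansIters, long seed) {
        Random rnd = new Random(seed);
        double[][] best = new double[k][];
        // Initial centroids are drawn with replacement, so duplicates are possible.
        for (int j = 0; j < k; j++) best[j] = data[rnd.nextInt(data.length)].clone();
        double bestSse = sse(data, best, assign(data, best));
        for (int t = 0; t < swaps; t++) {
            double[][] cand = new double[k][];
            for (int j = 0; j < k; j++) cand[j] = best[j].clone();
            cand[rnd.nextInt(k)] = data[rnd.nextInt(data.length)].clone();
            int[] labels = assign(data, cand);
            for (int it = 0; it < kmeansIters; it++) {
                cand = update(data, labels, cand);
                labels = assign(data, cand);
            }
            double candSse = sse(data, cand, labels);
            if (candSse < bestSse) { best = cand; bestSse = candSse; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Tiny made-up dataset with three well-separated groups.
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10}, {20, 0}, {20, 1}, {21, 0}};
        double[][] c = randomSwap(data, 3, 200, 2, 42L);
        System.out.println("SSE = " + sse(data, c, assign(data, c)));
    }
}
```

The accept-or-revert test is what separates this from plain k-means: a swap that lands in a poorly covered region can escape a bad local optimum, while a useless swap is simply discarded. On small inputs such as the toy dataset in `main`, the parallel stream overhead is likely to outweigh any gain; the speedup the abstract reports would be expected only on large datasets.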
Pages: 4