Efficient and Reliable Clustering by Parallel Random Swap Algorithm

Cited by: 2
Authors
Nigro, Libero [1]
Cicirelli, Franco [2]
Franti, Pasi [3]
Affiliations
[1] Univ Calabria, DIMES Dept Informat Modelling Elect & Syst Sci, I-87036 Arcavacata Di Rende, Italy
[2] Natl Res Council Italy, CNR, Inst High Performance Comp & Networking ICAR, I-87036 Arcavacata Di Rende, Italy
[3] Univ Eastern Finland, Sch Comp, Machine Learning Grp, POB 111, Joensuu 80101, Finland
Keywords
Clustering problem; K-Means; Random swap; Parallelism; Streams; Lambda Expressions; Java
DOI
10.1109/DS-RT55542.2022.9932090
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Solving large-scale clustering problems requires an efficient algorithm that can also be implemented in parallel. K-means would be suitable, but it can produce inaccurate clustering results. To overcome this problem, we present a parallel version of the random swap clustering algorithm, which combines the scalability of k-means with high clustering accuracy. The new method is implemented on top of Java parallel streams and lambda expressions, which yield notable execution time benefits. It is applied to standard benchmark datasets that vary in the number and distribution of records, the dimensionality of the data points, and the number of clusters. The experimental results confirm that parallel random swap achieves high-quality clustering together with high time efficiency.
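The abstract does not show the authors' code, so the following is only a minimal sketch of how random swap clustering could be laid out on Java parallel streams and lambda expressions. All names (`RandomSwapSketch`, `assign`, `update`, `sse`, `randomSwap`) and the parameters `swaps` and `kmeansIters` are hypothetical. It assumes squared Euclidean distance, parallelizes only the partition and cost steps, and keeps the centroid update sequential for brevity.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

// Illustrative sketch only; not the implementation evaluated in the paper.
class RandomSwapSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    // Index of the centroid nearest to point p (first one wins on ties).
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int j = 0; j < centroids.length; j++) {
            double d = dist2(p, centroids[j]);
            if (d < bestDist) { bestDist = d; best = j; }
        }
        return best;
    }

    // Partition step: each point is labelled independently, so it maps onto a parallel stream.
    static int[] assign(double[][] data, double[][] centroids) {
        return IntStream.range(0, data.length).parallel()
                .map(i -> nearest(data[i], centroids)).toArray();
    }

    // Sum of squared errors of a solution, reduced in parallel.
    static double sse(double[][] data, double[][] centroids, int[] labels) {
        return IntStream.range(0, data.length).parallel()
                .mapToDouble(i -> dist2(data[i], centroids[labels[i]])).sum();
    }

    // Centroid step (sequential here): mean of each cluster; empty clusters keep their old centroid.
    static double[][] update(double[][] data, int[] labels, double[][] old) {
        int k = old.length, dim = data[0].length;
        double[][] sum = new double[k][dim];
        int[] count = new int[k];
        for (int i = 0; i < data.length; i++) {
            count[labels[i]]++;
            for (int j = 0; j < dim; j++) sum[labels[i]][j] += data[i][j];
        }
        for (int c = 0; c < k; c++) {
            if (count[c] == 0) sum[c] = old[c].clone();
            else for (int j = 0; j < dim; j++) sum[c][j] /= count[c];
        }
        return sum;
    }

    // Random swap: replace one random centroid by a random data point,
    // fine-tune with a few k-means iterations, keep the result only if SSE improves.
    static double[][] randomSwap(double[][] data, int k, int swaps, int kmeansIters, long seed) {
        Random rnd = new Random(seed);
        double[][] best = new double[k][];
        for (int c = 0; c < k; c++) best[c] = data[rnd.nextInt(data.length)].clone();
        double bestCost = sse(data, best, assign(data, best));
        for (int t = 0; t < swaps; t++) {
            double[][] cand = new double[k][];
            for (int c = 0; c < k; c++) cand[c] = best[c].clone();
            cand[rnd.nextInt(k)] = data[rnd.nextInt(data.length)].clone();
            int[] labels = assign(data, cand);
            for (int it = 0; it < kmeansIters; it++) {
                cand = update(data, labels, cand);
                labels = assign(data, cand);
            }
            double cost = sse(data, cand, labels);
            if (cost < bestCost) { bestCost = cost; best = cand; }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {1, 1}, {10, 10}, {10, 11}, {11, 10}, {11, 11}};
        double[][] centroids = randomSwap(data, 2, 50, 2, 42L);
        System.out.println(Arrays.deepToString(centroids));
    }
}
```

In this sketch the swap loop itself stays sequential, because each trial depends on the best solution found so far; the parallelism comes from the per-point work inside `assign` and `sse`, which dominates the cost on large datasets.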
Pages: 4