Parallel random swap: An efficient and reliable clustering algorithm in Java

Cited by: 6
Authors
Nigro, Libero [1]
Cicirelli, Franco [2]
Fränti, Pasi [3]
Affiliations
[1] Univ Calabria, DIMES Dept Informat Modelling Elect & Syst Sci, I-87036 Arcavacata Di Rende, Italy
[2] CNR Natl Res Council Italy, Inst High Performance Comp & Networking ICAR Rende, I-87036 Arcavacata Di Rende, Italy
[3] Univ Eastern Finland, Sch Comp, Machine Learning Grp, POB 111, Joensuu 80101, Finland
Keywords
Clustering problem; K-means; Random swap; Parallelism; Java; Streams; Lambda expressions; Actors; Multi-core machines; K-MEANS ALGORITHM; OPTIMIZATION
DOI
10.1016/j.simpat.2022.102712
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Solving large-scale clustering problems requires an efficient algorithm that can also be implemented in parallel. K-means would be suitable, but it can lead to an inaccurate clustering result. To overcome this problem, we present a parallel version of the random swap clustering algorithm. It combines the scalability of k-means with the high clustering accuracy of random swap. The algorithm is implemented in Java in two ways. The first implementation uses Java parallel streams and lambda expressions. The solution exploits a built-in multi-threaded organization capable of offering competitive speedup. The second implementation is achieved on top of the Theatre actor system which ensures better scalability and high-performance computing through fine-grain resource control. The two implementations are then applied to standard benchmark datasets, with a varying population size and distribution of managed records, dimensionality of data points and the number of clusters. The experimental results confirm that high-quality clustering can be obtained together with a very good execution efficiency. Our Java code is publicly available at: https://github.com/uef-machine-learning.
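The idea summarized in the abstract, a random swap search whose k-means steps run on Java parallel streams with lambda expressions, can be sketched as follows. This is a minimal illustration and not the authors' published code: the class name `RandomSwapSketch`, every method name, and the parameters `swaps`, `kmeansIters`, and `seed` are assumptions made for this example. Only the partition and error computations are parallelized here; the centroid update is kept sequential for brevity.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical sketch of random swap clustering; all names are illustrative assumptions.
class RandomSwapSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            s += diff * diff;
        }
        return s;
    }

    // Index of the centroid nearest to point p.
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestD = Double.MAX_VALUE;
        for (int j = 0; j < centroids.length; j++) {
            double d = dist2(p, centroids[j]);
            if (d < bestD) { bestD = d; best = j; }
        }
        return best;
    }

    // Partition step: points are assigned independently, so a parallel stream fits.
    static int[] partition(double[][] data, double[][] centroids) {
        return IntStream.range(0, data.length).parallel()
                .map(i -> nearest(data[i], centroids))
                .toArray();
    }

    // Sum of squared errors of a labelling, also computed in parallel.
    static double sse(double[][] data, double[][] centroids, int[] labels) {
        return IntStream.range(0, data.length).parallel()
                .mapToDouble(i -> dist2(data[i], centroids[labels[i]]))
                .sum();
    }

    // Centroid update: mean of the assigned points; an empty cluster keeps its old centroid.
    static double[][] update(double[][] data, int[] labels, double[][] old) {
        int k = old.length, dim = old[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];
        for (int i = 0; i < data.length; i++) {
            counts[labels[i]]++;
            for (int d = 0; d < dim; d++) sums[labels[i]][d] += data[i][d];
        }
        for (int j = 0; j < k; j++)
            for (int d = 0; d < dim; d++)
                sums[j][d] = counts[j] == 0 ? old[j][d] : sums[j][d] / counts[j];
        return sums;
    }

    static double[][] copy(double[][] m) {
        return Arrays.stream(m).map(double[]::clone).toArray(double[][]::new);
    }

    // Random swap: replace one random centroid with a random data point, refine with
    // a few k-means iterations, and keep the candidate only if it lowers the SSE.
    static double[][] randomSwap(double[][] data, int k, int swaps, int kmeansIters, long seed) {
        Random rnd = new Random(seed);
        double[][] best = new double[k][];
        for (int j = 0; j < k; j++) best[j] = data[rnd.nextInt(data.length)].clone();
        double bestSse = sse(data, best, partition(data, best));
        for (int t = 0; t < swaps; t++) {
            double[][] cand = copy(best);
            cand[rnd.nextInt(k)] = data[rnd.nextInt(data.length)].clone();
            for (int it = 0; it < kmeansIters; it++)
                cand = update(data, partition(data, cand), cand);
            double candSse = sse(data, cand, partition(data, cand));
            if (candSse < bestSse) { best = cand; bestSse = candSse; }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        double[][] c = randomSwap(data, 2, 100, 2, 42L);
        System.out.println("SSE = " + sse(data, c, partition(data, c)));
    }
}
```

Accepting a candidate only when it strictly lowers the error means the best solution found never degrades as more swaps are tried, so the number of swaps trades running time against the chance of escaping a poor k-means local optimum.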
Pages: 17
Related papers
50 in total
  • [31] Parallel programming with Easy Java Simulations
    Esquembre, F.
    Christian, W.
    Belloni, M.
    AMERICAN JOURNAL OF PHYSICS, 2018, 86 (01) : 54 - 67
  • [32] JPVM: network parallel computing in Java
    Ferrari, A
    CONCURRENCY-PRACTICE AND EXPERIENCE, 1998, 10 (11-13): : 985 - 992
  • [33] Scriptic: Parallel programming in extended Java
    vanDelft, A
    PARALLEL PROGRAMMING AND JAVA, 1997, 50 : 17 - 33
  • [34] Generation of distributed parallel Java programs
    Launay, P
    Pazat, JL
    EURO-PAR '98 PARALLEL PROCESSING, 1998, 1470 : 729 - 732
  • [35] Safe Parallel Programming with Session Java
    Ng, Nicholas
    Yoshida, Nobuko
    Pernet, Olivier
    Hu, Raymond
    Kryftis, Yiannos
    COORDINATION MODELS AND LANGUAGES, COORDINATION 2011, 2011, 6721 : 110 - 126
  • [36] Easing parallel programming for clusters with Java
    Launay, P
    Pazat, JL
    FUTURE GENERATION COMPUTER SYSTEMS, 2001, 18 (02) : 253 - 263
  • [37] Parallel graph coloring using JAVA
    Umland, T
    ARCHITECTURES, LANGUAGES AND PATTERNS FOR PARALLEL AND DISTRIBUTED APPLICATIONS, 1998, 52 : 211 - 217
  • [38] Embarrassingly parallel applications on a Java cluster
    Vinter, B
    HIGH PERFORMANCE COMPUTING AND NETWORKING, PROCEEDINGS, 2000, 1823 : 614 - 617
  • [39] Teaching Parallel Programming Using Java
    Shafi, Aamir
    Akhtar, Aleem
    Javed, Ansar
    Carpenter, Bryan
    2014 WORKSHOP ON EDUCATION FOR HIGH PERFORMANCE COMPUTING (EDUHPC), 2014, : 56 - 63
  • [40] Aspect Oriented Parallel Framework for Java
    Medeiros, Bruno
    Sobral, Joao L.
    HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2016, 2017, 10150 : 220 - 233