Parallel random swap: An efficient and reliable clustering algorithm in Java

Cited by: 6
Authors
Nigro, Libero [1]
Cicirelli, Franco [2]
Fränti, Pasi [3]
Affiliations
[1] Univ Calabria, DIMES Dept Informat Modelling Elect & Syst Sci, I-87036 Arcavacata Di Rende, Italy
[2] CNR Natl Res Council Italy, Inst High Performance Comp & Networking ICAR Rende, I-87036 Arcavacata Di Rende, Italy
[3] Univ Eastern Finland, Sch Comp, Machine Learning Grp, POB 111, Joensuu 80101, Finland
Keywords
Clustering problem; K-means; Random swap; Parallelism; Java; Streams; Lambda expressions; Actors; Multi-core machines; K-MEANS ALGORITHM; OPTIMIZATION
DOI
10.1016/j.simpat.2022.102712
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Solving large-scale clustering problems requires an efficient algorithm that can also be implemented in parallel. K-means would be suitable, but it can produce an inaccurate clustering result. To overcome this problem, we present a parallel version of the random swap clustering algorithm, which combines the scalability of k-means with the high clustering accuracy of random swap. The algorithm is implemented in Java in two ways. The first implementation uses Java parallel streams and lambda expressions, exploiting their built-in multi-threaded organization to offer competitive speedup. The second implementation is built on top of the Theatre actor system, which ensures better scalability and high-performance computing through fine-grained resource control. The two implementations are then applied to standard benchmark datasets that vary in population size and distribution of the managed records, dimensionality of the data points, and number of clusters. The experimental results confirm that high-quality clustering can be obtained together with very good execution efficiency. Our Java code is publicly available at: https://github.com/uef-machine-learning.
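
The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the class name `RandomSwapSketch`, all method names, the fixed two k-means iterations per swap, and the sequential centroid update are assumptions made for illustration only. The sketch replaces one randomly chosen centroid with a randomly chosen data point, fine-tunes the candidate with a few k-means iterations, and keeps it only if the sum of squared errors (SSE) decreases. The point-to-centroid assignment and the SSE computation run on Java parallel streams.

```java
import java.util.Random;
import java.util.stream.IntStream;

// Illustrative sketch only; names and parameters are hypothetical.
class RandomSwapSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return s;
    }

    // Index of the centroid nearest to p (ties go to the lowest index).
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        for (int j = 1; j < centroids.length; j++)
            if (dist2(p, centroids[j]) < dist2(p, centroids[best])) best = j;
        return best;
    }

    // Partition step: each point is labeled independently, so a parallel stream fits.
    static int[] assign(double[][] data, double[][] centroids) {
        return IntStream.range(0, data.length).parallel()
                .map(i -> nearest(data[i], centroids)).toArray();
    }

    // Sum of squared errors of the data against its nearest centroids, in parallel.
    static double sse(double[][] data, double[][] centroids) {
        return IntStream.range(0, data.length).parallel()
                .mapToDouble(i -> dist2(data[i], centroids[nearest(data[i], centroids)]))
                .sum();
    }

    // Centroid step (sequential for brevity); an empty cluster keeps its old centroid.
    static double[][] update(double[][] data, int[] labels, double[][] old) {
        int k = old.length, d = data[0].length;
        double[][] sums = new double[k][d];
        int[] counts = new int[k];
        for (int i = 0; i < data.length; i++) {
            counts[labels[i]]++;
            for (int t = 0; t < d; t++) sums[labels[i]][t] += data[i][t];
        }
        for (int j = 0; j < k; j++) {
            if (counts[j] == 0) { sums[j] = old[j].clone(); continue; }
            for (int t = 0; t < d; t++) sums[j][t] /= counts[j];
        }
        return sums;
    }

    // A fixed number of k-means iterations; the input centroids are not modified.
    static double[][] kmeans(double[][] data, double[][] centroids, int iters) {
        double[][] c = centroids;
        for (int it = 0; it < iters; it++) c = update(data, assign(data, c), c);
        return c;
    }

    // Random swap: trial-and-error centroid replacement, accepted only on improvement.
    static double[][] randomSwap(double[][] data, int k, int swaps, long seed) {
        Random rnd = new Random(seed);
        double[][] best = new double[k][];
        for (int j = 0; j < k; j++) best[j] = data[rnd.nextInt(data.length)].clone();
        best = kmeans(data, best, 2);
        double bestCost = sse(data, best);
        for (int s = 0; s < swaps; s++) {
            double[][] cand = new double[k][];
            for (int j = 0; j < k; j++) cand[j] = best[j].clone();
            cand[rnd.nextInt(k)] = data[rnd.nextInt(data.length)].clone(); // the swap
            cand = kmeans(data, cand, 2);                                  // fine-tune
            double cost = sse(data, cand);
            if (cost < bestCost) { best = cand; bestCost = cost; }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {1, 1},
                           {10, 10}, {10, 11}, {11, 10}, {11, 11}};
        double[][] c = randomSwap(data, 2, 20, 42L);
        System.out.println("SSE = " + sse(data, c));
    }
}
```

Because a swap is accepted only when it lowers the SSE, the cost never increases as swaps accumulate, and the per-point work inside each trial is where a parallel stream can use multiple cores.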
Pages: 17