Parallel random swap: An efficient and reliable clustering algorithm in Java

Cited by: 6
Authors
Nigro, Libero [1]
Cicirelli, Franco [2]
Fränti, Pasi [3]
Affiliations
[1] Univ Calabria, DIMES Dept Informat Modelling Elect & Syst Sci, I-87036 Arcavacata Di Rende, Italy
[2] CNR Natl Res Council Italy, Inst High Performance Comp & Networking ICAR Rende, I-87036 Arcavacata Di Rende, Italy
[3] Univ Eastern Finland, Sch Comp, Machine Learning Grp, POB 111, Joensuu 80101, Finland
Keywords
Clustering problem; K-means; Random swap; Parallelism; Java; Streams; Lambda expressions; Actors; Multi-core machines; K-MEANS ALGORITHM; OPTIMIZATION
DOI
10.1016/j.simpat.2022.102712
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Solving large-scale clustering problems requires an efficient algorithm that can also be implemented in parallel. K-means would be suitable, but it can lead to an inaccurate clustering result. To overcome this problem, we present a parallel version of the random swap clustering algorithm. It combines the scalability of k-means with the high clustering accuracy of random swap. The algorithm is implemented in Java in two ways. The first implementation uses Java parallel streams and lambda expressions. The solution exploits a built-in multi-threaded organization capable of offering competitive speedup. The second implementation is achieved on top of the Theatre actor system which ensures better scalability and high-performance computing through fine-grain resource control. The two implementations are then applied to standard benchmark datasets, with a varying population size and distribution of managed records, dimensionality of data points and the number of clusters. The experimental results confirm that high-quality clustering can be obtained together with a very good execution efficiency. Our Java code is publicly available at: https://github.com/uef-machine-learning.
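The idea summarized in the abstract, random swap with its k-means steps run in parallel through Java streams and lambda expressions, can be illustrated with a minimal sketch. This is not the authors' code (see the repository linked above for that); the class name `RandomSwapSketch`, all method names, the choice of two k-means iterations per swap, and the toy data are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical illustration only; names and parameters are not from the paper.
final class RandomSwapSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double dist2(double[] a, double[] b) {
        double s = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            s += diff * diff;
        }
        return s;
    }

    /** Index of the centroid nearest to point p (ties go to the lower index). */
    static int nearest(double[] p, double[][] c) {
        int best = 0;
        for (int j = 1; j < c.length; j++) {
            if (dist2(p, c[j]) < dist2(p, c[best])) best = j;
        }
        return best;
    }

    /** Sum of squared errors, evaluated with a parallel stream over the points. */
    static double sse(double[][] x, double[][] c) {
        return Arrays.stream(x).parallel()
                .mapToDouble(p -> dist2(p, c[nearest(p, c)]))
                .sum();
    }

    /** One k-means iteration: parallel assignment, then one mean update per cluster. */
    static void kMeansStep(double[][] x, double[][] c) {
        int[] label = Arrays.stream(x).parallel().mapToInt(p -> nearest(p, c)).toArray();
        IntStream.range(0, c.length).parallel().forEach(j -> {
            double[] sum = new double[x[0].length];
            int count = 0;
            for (int i = 0; i < x.length; i++) {
                if (label[i] == j) {
                    for (int d = 0; d < sum.length; d++) sum[d] += x[i][d];
                    count++;
                }
            }
            if (count > 0) { // an empty cluster keeps its previous centroid
                for (int d = 0; d < sum.length; d++) sum[d] /= count;
                c[j] = sum;
            }
        });
    }

    /** Random swap: trial swaps are kept only when they lower the SSE. */
    static double[][] randomSwap(double[][] x, int k, int swaps, long seed) {
        Random rnd = new Random(seed);
        double[][] best = new double[k][];
        // Initial centroids are random data points (duplicates are possible in this sketch).
        for (int j = 0; j < k; j++) best[j] = x[rnd.nextInt(x.length)].clone();
        double bestSse = sse(x, best);
        for (int t = 0; t < swaps; t++) {
            double[][] cand = new double[k][];
            for (int j = 0; j < k; j++) cand[j] = best[j].clone();
            // Swap: replace a random centroid with a random data point.
            cand[rnd.nextInt(k)] = x[rnd.nextInt(x.length)].clone();
            kMeansStep(x, cand); // assumed: two k-means iterations of local fine-tuning
            kMeansStep(x, cand);
            double candSse = sse(x, cand);
            if (candSse < bestSse) { // accept only improving swaps
                best = cand;
                bestSse = candSse;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] x = {
            {0, 0}, {0, 1}, {1, 0}, {1, 1},
            {10, 10}, {10, 11}, {11, 10}, {11, 11}
        };
        double[][] c = randomSwap(x, 2, 50, 1L);
        System.out.println("centroids = " + Arrays.deepToString(c));
        System.out.println("SSE = " + sse(x, c));
    }
}
```

In this sketch the parallelism is confined to the per-point assignment, the per-cluster mean update, and the SSE evaluation, while the swap loop itself stays sequential because each trial depends on the previously accepted solution.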
Pages: 17
Related papers
50 in total
  • [21] Data parallel skeletons in Java
    Kuchen, Herbert
    Ernsting, Steffen
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2012, 2012, 9 : 1817 - 1826
  • [22] Centroid Ratio for a Pairwise Random Swap Clustering Algorithm
    Zhao, Qinpei
    Franti, Pasi
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (05) : 1090 - 1101
  • [23] Efficient Java™ monitors
    Blomdell, A
    FOURTH IEEE INTERNATIONAL SYMPOSIUM ON OBJECT-ORIENTED REAL-TIME DISTRIBUTED COMPUTING, PROCEEDINGS, 2001, : 270 - 276
  • [24] Jcluster: an efficient Java parallel environment on a large-scale heterogeneous cluster
    Zhang, Bao-Yin
    Yang, Guang-Wen
    Zheng, Wei-Min
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2006, 18 (12) : 1541 - 1557
  • [25] Efficient computation of May-Happen-in-Parallel information for concurrent Java programs
    Barik, Rajkishore
    LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, 2006, 4339 : 152 - 169
  • [26] A Java simulation tool for fuzzy clustering
    Egan, MA
    Krishnamoorthy, M
    Rajan, K
    CONCURRENCY-PRACTICE AND EXPERIENCE, 1997, 9 (11) : 1327 - 1332
  • [27] Refactoring Clustering in Java Software Networks
    Concas, Giulio
    Monni, C.
    Orru, M.
    Ortu, M.
    Tonelli, Roberto
    AGILE METHODS: LARGE-SCALE DEVELOPMENT, REFACTORING, TESTING, AND ESTIMATION, 2014, 199 : 121 - 135
  • [28] Javalanche: Efficient Mutation Testing for Java
    Schuler, David
    Zeller, Andreas
    7TH JOINT MEETING OF THE EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND THE ACM SIGSOFT SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, 2009, : 297 - 298
  • [29] A reliable group communication system for Java application
    Moon, ND
    Lee, MJ
    KORUS 2003: 7TH KOREA-RUSSIA INTERNATIONAL SYMPOSIUM ON SCIENCE AND TECHNOLOGY, VOL 2, PROCEEDINGS: ELECTRICAL ENGINEERING AND INFORMATION TECHNOLOGY, 2003, : 368 - 372
  • [30] Teaching Parallel Programming with Java and Pyjama
    Kurniawati, Ruth
    PROCEEDINGS OF THE 53RD ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION (SIGCSE 2022), VOL 2, 2022, : 1109 - 1109