Near-Optimal Clustering in the k-machine model

Cited by: 9
Authors
Bandyapadhyay, Sayan [1]
Inamdar, Tanmay [1]
Pai, Shreyas [1]
Pemmaraju, Sriram V. [1]
Affiliations
[1] Univ Iowa, Iowa City, IA 52242 USA
Keywords
Clustering; Facility location; k-median; k-center; k-machine model; large-scale clustering; distributed clustering
DOI
10.1145/3154273.3154317
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The clustering problem, in its many variants, has numerous applications in operations research and computer science (e.g., bioinformatics, image processing, and social network analysis). As data sets have grown rapidly, researchers have focused on designing algorithms for clustering problems in models of computation suited to large-scale computation, such as MapReduce, Pregel, and streaming models. The k-machine model (Klauck et al., SODA 2015) is a simple message-passing model for large-scale distributed graph processing. This paper considers three of the most prominent clustering problems, namely the uncapacitated facility location problem, the p-median problem, and the p-center problem, and presents O(1)-factor approximation algorithms for these problems running in $\tilde{O}(n/k)$ rounds in the k-machine model. These algorithms are optimal up to polylogarithmic factors because this paper also shows $\tilde{\Omega}(n/k)$ lower bounds for obtaining poly(n)-factor approximation algorithms for these problems. These are the first results for clustering problems in the k-machine model. We assume that the metric given as input for these clustering problems is provided only implicitly, as an edge-weighted graph. In a nutshell, our main technical contribution is to show that constant-factor approximation algorithms for all three clustering problems can be obtained by learning only a small portion of the input metric.
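As an illustration of the three objectives named in the abstract, the following is a minimal sequential sketch, not the paper's k-machine algorithm. It assumes the metric is the shortest-path distance in an edge-weighted graph, as the abstract describes, and evaluates the p-median, p-center, and uncapacitated facility location costs of a given set of centers. The function names, the adjacency-dict representation, and the example graph are all hypothetical choices made for this sketch.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's algorithm over an adjacency dict {u: {v: weight}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def clustering_costs(graph, centers, opening_cost=None):
    """Evaluate the three objectives for a fixed, non-empty set of centers.

    The metric is implicit: distances are shortest paths in `graph`.
    `opening_cost` (hypothetical parameter) maps each center to its
    facility opening cost; when omitted, only p-median and p-center
    costs are returned.
    """
    per_center = [shortest_paths(graph, c) for c in centers]
    # Distance from each vertex to its nearest chosen center.
    nearest = {
        v: min(d.get(v, float("inf")) for d in per_center) for v in graph
    }
    costs = {
        "p_median": sum(nearest.values()),  # sum of connection distances
        "p_center": max(nearest.values()),  # largest connection distance
    }
    if opening_cost is not None:
        costs["facility_location"] = (
            sum(opening_cost[c] for c in centers) + costs["p_median"]
        )
    return costs


# Hypothetical example: a path a - b - c - d with edge weights 1, 2, 3.
example_graph = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 2},
    "c": {"b": 2, "d": 3},
    "d": {"c": 3},
}
one_center = clustering_costs(example_graph, ["b"], opening_cost={"b": 4})
two_centers = clustering_costs(example_graph, ["b", "d"])
```

The sketch runs one full shortest-path computation per center, so it reads far more of the metric than the approach the abstract describes, which learns only a small portion of it.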
Pages: 10