Near-Optimal Clustering in the k-machine model

Cited by: 9

Authors:

Bandyapadhyay, Sayan [1]

Inamdar, Tanmay [1]

Pai, Shreyas [1]

Pemmaraju, Sriram V. [1]

Affiliation:

[1] Univ Iowa, Iowa City, IA 52242 USA

Source:

ICDCN'18: PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING AND NETWORKING | 2018

Keywords:

Clustering; Facility location; k-median; k-center; k-machine model; large-scale clustering; distributed clustering

DOI:

10.1145/3154273.3154317

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

The clustering problem, in its many variants, has numerous applications in operations research and computer science (e.g., in bioinformatics, image processing, and social network analysis). As data sets have grown rapidly in size, researchers have focused on designing algorithms for clustering problems in models of computation suited to large-scale computation, such as MapReduce, Pregel, and streaming models. The k-machine model (Klauck et al., SODA 2015) is a simple message-passing model for large-scale distributed graph processing. This paper considers three of the most prominent clustering problems: the uncapacitated facility location problem, the p-median problem, and the p-center problem, and presents O(1)-factor approximation algorithms for these problems running in $\tilde{O}(n/k)$ rounds in the k-machine model. These algorithms are optimal up to polylogarithmic factors, because this paper also shows $\tilde{\Omega}(n/k)$ lower bounds for obtaining poly(n)-factor approximation algorithms for these problems. These are the first results for clustering problems in the k-machine model. We assume that the input metric for these clustering problems is provided only implicitly, as an edge-weighted graph. In a nutshell, our main technical contribution is to show that constant-factor approximation algorithms for all three clustering problems can be obtained by learning only a small portion of the input metric.


Pages: 10

50 records in total

[1] Near-optimal clustering in the k-machine model
Bandyapadhyay, Sayan
Inamdar, Tanmay
Pai, Shreyas
Pemmaraju, Sriram V.
THEORETICAL COMPUTER SCIENCE, 2022, 899 : 80 - 97
[2] Near-Optimal k-Clustering in the Sliding Window Model
Woodruff, David P.
Zhong, Peilin
Zhou, Samson
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
[3] Near-Optimal Private and Scalable k-Clustering
Cohen-Addad, Vincent
Epasto, Alessandro
Mirrokni, Vahab
Narayanan, Shyam
Zhong, Peilin
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
[4] Near-optimal large-scale k-medoids clustering
Ushakov, Anton V.
Vasilyev, Igor
INFORMATION SCIENCES, 2021, 545 : 344 - 362
[5] Near-Optimal Correlation Clustering with Privacy
Cohen-Addad, Vincent
Fan, Chenglin
Lattanzi, Silvio
Mitrovic, Slobodan
Norouzi-Fard, Ashkan
Parotsidis, Nikos
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
[6] Near-Optimal Comparison Based Clustering
Perrot, Michael
Esser, Pascal Mattia
Ghoshdastidar, Debarghya
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
[7] Implementation of a Near-Optimal Complex Root Clustering Algorithm
Imbach, Remi
Pan, Victor Y.
Yap, Chee
MATHEMATICAL SOFTWARE - ICMS 2018, 2018, 10931 : 235 - 244
[8] Optimal and near-optimal algorithms for k-item broadcast
Santos, EE
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1999, 57 (02) : 121 - 139
[9] Efficient Distributed Algorithms in the k-machine model via PRAM Simulations
Augustine, John
Kothapalli, Kishore
Pandurangan, Gopal
2021 IEEE 35TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2021, : 223 - 232
[10] Onion Curve: A Space Filling Curve with Near-Optimal Clustering
Xu, Pan
Cuong Nguyen
Tirthapura, Srikanta
2018 IEEE 34TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2018, : 1236 - 1239
