The Communication Complexity of Distributed ε-Approximations

Cited by: 1
Authors
Huang, Zengfeng [1]
Yi, Ke [2]
Affiliations
[1] Aarhus Univ, MADALGO, DK-8000 Aarhus C, Denmark
[2] HKUST, Dept CSE, Hong Kong, Hong Kong, Peoples R China
Keywords
epsilon-approximations; communication complexity; discrepancy; distributed data; BOUNDS;
DOI
10.1109/FOCS.2014.69
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Data summarization is an effective approach to dealing with the "big data" problem. While data summarization problems traditionally have been studied in the streaming model, the focus is starting to shift to distributed models, as distributed/parallel computation seems to be the only viable way to handle today's massive data sets. In this paper, we study epsilon-approximations, a classical data summary that, intuitively speaking, preserves approximately the density of the underlying data set over a certain range space. We consider the problem of computing epsilon-approximations for a data set which is held jointly by k players, and give general communication upper and lower bounds that hold for any range space whose discrepancy is known.
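
To make the notion of "preserving density over a range space" concrete, the following is a minimal sketch, not the paper's protocol. It assumes the simplest range space, one-sided intervals (-inf, t] on the real line, and checks whether a summary is an epsilon-approximation of a data set by comparing the fraction of points each places in every such range. The function names `max_density_error` and `is_eps_approximation` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def max_density_error(data, summary):
    """Largest gap between data density and summary density
    over all ranges of the form (-inf, t] (an assumed range space)."""
    data_sorted = sorted(data)
    summary_sorted = sorted(summary)
    worst = 0.0
    # Both densities are step functions that change only at points of
    # either set, so checking those thresholds covers every range.
    for t in set(data_sorted) | set(summary_sorted):
        d = bisect_right(data_sorted, t) / len(data_sorted)
        s = bisect_right(summary_sorted, t) / len(summary_sorted)
        worst = max(worst, abs(d - s))
    return worst


def is_eps_approximation(data, summary, eps):
    """True if the summary's density is within eps of the data's on every range."""
    return max_density_error(data, summary) <= eps


data = list(range(1, 101))   # 100 points: 1, 2, ..., 100
summary = data[9::10]        # every 10th point: 10, 20, ..., 100
```

In this example the evenly spaced 10-point summary is off by at most about 0.09 on any one-sided interval, so it passes the check for eps = 0.1 and fails it for eps = 0.05.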
Pages: 591-600
Number of pages: 10