Distributed Wavelet Thresholding for Maximum Error Metrics

Cited by: 4

Authors:

Mytilinis, Ioannis [1]

Tsoumakos, Dimitrios [2]

Koziris, Nectarios [1]

Affiliations:

[1] Natl Tech Univ Athens, Comp Syst Lab, Athens, Greece

[2] Ionian Univ, Dept Informat, Corfu, Greece

Source:

SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA | 2016

DOI:

10.1145/2882903.2915230

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Modern data analytics involve simple and complex computations over enormous numbers of data records. The volume of data and the increasingly stringent response-time requirements place increasing emphasis on the efficiency of approximate query processing. A major challenge over the past years has been the efficient construction of fixed-space synopses that provide a deterministic quality guarantee, often expressed in terms of a maximum error metric. For data reduction, wavelet decomposition has proved to be a very effective tool, as it can successfully approximate sharp discontinuities and provide accurate answers to queries. However, existing polynomial time wavelet thresholding schemes that minimize maximum error metrics are constrained with impractical time and space complexities for large datasets. In order to provide a practical solution to the problem, we develop parallel algorithms that take advantage of key properties of the wavelet decomposition and allocate tasks to multiple workers. To that end, we present i) a general framework for the parallelization of existing dynamic programming algorithms, ii) a parallel version of one such DP-based algorithm and iii) a new parallel greedy algorithm for the problem. To the best of our knowledge, this is the first attempt to scale algorithms for wavelet thresholding for maximum error metrics via a state-of-the-art distributed run-time. Our extensive experiments on both real and synthetic datasets over Hadoop show that the proposed algorithms achieve linear scalability and superior running-time performance compared to their centralized counterparts. Furthermore, our distributed greedy algorithm outperforms the distributed version of the current state-of-the-art dynamic programming algorithm by 2 to 4 times, without compromising the quality of results.
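To make the abstract's terms concrete, the following minimal Python sketch builds an unnormalized Haar wavelet decomposition, keeps a fixed budget of coefficients, and measures the maximum absolute error of the resulting synopsis. It is an illustration only and is not one of the paper's algorithms: the function names (`haar_decompose`, `haar_reconstruct`, `keep_largest`, `max_abs_error`), the sample data, and the naive "keep the largest-magnitude coefficients" selection rule are all assumptions introduced here. The DP-based and greedy algorithms described in the abstract instead choose which coefficients to retain so that the maximum error itself is minimized.

```python
def haar_decompose(data):
    """Unnormalized Haar transform: repeated pairwise averages and half-differences.

    Returns [overall average, detail coefficients from coarsest to finest].
    The input length is assumed to be a power of two.
    """
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    details = []
    level = list(data)
    while len(level) > 1:
        avgs = [(level[i] + level[i + 1]) / 2 for i in range(0, len(level), 2)]
        dets = [(level[i] - level[i + 1]) / 2 for i in range(0, len(level), 2)]
        details = dets + details  # coarser levels end up in front
        level = avgs
    return level + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose: each (average, detail) pair yields two values."""
    level = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        dets = coeffs[pos:pos + len(level)]
        pos += len(level)
        nxt = []
        for avg, det in zip(level, dets):
            nxt.extend([avg + det, avg - det])
        level = nxt
    return level


def keep_largest(coeffs, budget):
    """Naive thresholding (an assumption of this sketch, not the paper's method):
    retain the `budget` coefficients of largest magnitude and zero the rest."""
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    kept = set(order[:budget])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def max_abs_error(original, approx):
    """Maximum absolute error between the data and its reconstruction."""
    return max(abs(a - b) for a, b in zip(original, approx))


if __name__ == "__main__":
    data = [2, 2, 0, 2, 3, 5, 4, 4]  # hypothetical sample values
    coeffs = haar_decompose(data)
    synopsis = keep_largest(coeffs, budget=3)
    approx = haar_reconstruct(synopsis)
    print("coefficients:", coeffs)
    print("synopsis:    ", synopsis)
    print("max abs error:", max_abs_error(data, approx))
```

The sketch runs on a single machine over a tiny array. The work summarized above targets inputs too large for such centralized processing and distributes the thresholding across multiple workers.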


Pages: 663-677

Page count: 15

