Distributed Wavelet Thresholding for Maximum Error Metrics

Cited by: 4
Authors
Mytilinis, Ioannis [1]
Tsoumakos, Dimitrios [2]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, Comp Syst Lab, Athens, Greece
[2] Ionian Univ, Dept Informat, Corfu, Greece
Keywords
DOI
10.1145/2882903.2915230
Chinese Library Classification code
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Modern data analytics involves both simple and complex computations over enormous numbers of data records. The volume of data and increasingly stringent response-time requirements put growing emphasis on efficient approximate query processing. A major challenge in recent years has been the efficient construction of fixed-space synopses that provide a deterministic quality guarantee, often expressed in terms of a maximum error metric. For data reduction, wavelet decomposition has proved to be a very effective tool, as it can successfully approximate sharp discontinuities and provide accurate answers to queries. However, existing polynomial-time wavelet thresholding schemes that minimize maximum error metrics have time and space complexities that are impractical for large datasets. To provide a practical solution to the problem, we develop parallel algorithms that take advantage of key properties of the wavelet decomposition and allocate tasks to multiple workers. To that end, we present (i) a general framework for the parallelization of existing dynamic programming (DP) algorithms, (ii) a parallel version of one such DP-based algorithm, and (iii) a new parallel greedy algorithm for the problem. To the best of our knowledge, this is the first attempt to scale algorithms for wavelet thresholding for maximum error metrics via a state-of-the-art distributed run-time. Our extensive experiments on both real and synthetic datasets over Hadoop show that the proposed algorithms achieve linear scalability and superior running-time performance compared to their centralized counterparts. Furthermore, our distributed greedy algorithm outperforms the distributed version of the current state-of-the-art dynamic programming algorithm by a factor of 2 to 4, without compromising the quality of results.
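To make the terms in the abstract concrete, the following is a minimal, centralized sketch of a Haar wavelet synopsis and the maximum absolute error it incurs. It is not the paper's DP or greedy algorithm: it uses the conventional rule of keeping the coefficients with the largest normalized magnitude, which targets L2 error and gives no guarantee on maximum error. All function names are hypothetical, and the input length is assumed to be a power of two.

```python
import math


def haar_decompose(data):
    """Unnormalized Haar transform: repeated pairwise averages and semi-differences.

    Assumes len(data) is a power of two (an assumption of this sketch).
    Returns [overall average, coarsest detail, ..., finest details].
    """
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    averages = [float(x) for x in data]
    details = []
    while len(averages) > 1:
        pairs = [(averages[i], averages[i + 1]) for i in range(0, len(averages), 2)]
        level_details = [(a - b) / 2 for a, b in pairs]
        averages = [(a + b) / 2 for a, b in pairs]
        details = level_details + details  # coarser levels go in front
    return averages + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose, level by level from coarse to fine."""
    values = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(values)]
        values = [v for avg, d in zip(values, details) for v in (avg + d, avg - d)]
        pos += len(details)
    return values


def keep_largest(coeffs, budget):
    """Conventional thresholding: keep `budget` coefficients with the largest
    normalized magnitude |c| / sqrt(2^level) and zero out the rest."""
    def weight(i):
        level = max(i, 1).bit_length() - 1  # indices 0 and 1 sit at level 0
        return abs(coeffs[i]) / math.sqrt(2 ** level)

    kept = set(sorted(range(len(coeffs)), key=weight, reverse=True)[:budget])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def max_abs_error(original, approx):
    """Maximum absolute error between the data and its approximation."""
    return max(abs(a - b) for a, b in zip(original, approx))


data = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_decompose(data)
for budget in (1, 2, len(coeffs)):
    approx = haar_reconstruct(keep_largest(coeffs, budget))
    print(budget, max_abs_error(data, approx))
```

A synopsis built this way stores only `budget` coefficients. The thresholding schemes the abstract refers to instead choose which coefficients to keep so that the maximum error itself is minimized.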
Pages: 663 - 677
Number of pages: 15
Related papers
50 in total
  • [11] A priori discretization error metrics for distributed hydrologic modeling applications
    Liu, Hongli
    Tolson, Bryan A.
    Craig, James R.
    Shafii, Mahyar
    JOURNAL OF HYDROLOGY, 2016, 543 : 873 - 891
  • [12] MINIMUM ERROR THRESHOLDING
    KITTLER, J
    ILLINGWORTH, J
    PATTERN RECOGNITION, 1986, 19 (01) : 41 - 47
  • [13] A study of wavelet thresholding denoising
    Guo, DF
    Zhu, WH
    Gao, ZM
    Zhang, JQ
    2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 329 - 332
  • [14] Wavelet thresholding of multivalued images
    Scheunders, P
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2004, 13 (04) : 475 - 483
  • [15] Adaptive thresholding of wavelet coefficients
    Abramovich, Felix
    Benjamini, Yoav
    Computational Statistics and Data Analysis, 1996, 22 (04) : 351 - 361
  • [16] Total Variation Wavelet Thresholding
    Tony F. Chan
    Hao-Min Zhou
    Journal of Scientific Computing, 2007, 32 : 315 - 341
  • [17] Anisotropic wavelet bases and thresholding
    Hochmuth, Reinhard
    MATHEMATISCHE NACHRICHTEN, 2007, 280 (5-6) : 523 - 533
  • [18] Filtered wavelet thresholding methods
    Bacchelli, S
    Papi, S
    JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2004, 164 : 39 - 52
  • [19] Adaptive thresholding of wavelet coefficients
    Abramovich, F
    Benjamini, Y
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 1996, 22 (04) : 351 - 361
  • [20] Total variation wavelet thresholding
    Chan, Tony F.
    Zhou, Hao-Min
    JOURNAL OF SCIENTIFIC COMPUTING, 2007, 32 (02) : 315 - 341