Distributed Wavelet Thresholding for Maximum Error Metrics

Cited by: 4
Authors
Mytilinis, Ioannis [1]
Tsoumakos, Dimitrios [2]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, Comp Syst Lab, Athens, Greece
[2] Ionian Univ, Dept Informat, Corfu, Greece
DOI
10.1145/2882903.2915230
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Modern data analytics involve simple and complex computations over enormous numbers of data records. The volume of data and increasingly stringent response-time requirements place growing emphasis on the efficiency of approximate query processing. A major challenge in recent years has been the efficient construction of fixed-space synopses that provide a deterministic quality guarantee, often expressed in terms of a maximum error metric. For data reduction, wavelet decomposition has proved to be a very effective tool, as it can successfully approximate sharp discontinuities and provide accurate answers to queries. However, existing polynomial-time wavelet thresholding schemes that minimize maximum error metrics have time and space complexities that are impractical for large datasets. To provide a practical solution, we develop parallel algorithms that exploit key properties of the wavelet decomposition and allocate tasks to multiple workers. To that end, we present (i) a general framework for parallelizing existing dynamic programming (DP) algorithms, (ii) a parallel version of one such DP-based algorithm, and (iii) a new parallel greedy algorithm for the problem. To the best of our knowledge, this is the first attempt to scale wavelet thresholding algorithms for maximum error metrics via a state-of-the-art distributed runtime. Our extensive experiments on both real and synthetic datasets over Hadoop show that the proposed algorithms achieve linear scalability and superior running-time performance compared to their centralized counterparts. Furthermore, our distributed greedy algorithm outperforms the distributed version of the current state-of-the-art dynamic programming algorithm by a factor of 2 to 4, without compromising the quality of results.
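To make the terms in the abstract concrete, the following is a minimal Python sketch of a Haar wavelet synopsis and of the maximum absolute error it incurs. It is not the paper's DP or greedy algorithm: it uses conventional thresholding (keep the coefficients with the largest normalized magnitude, which targets L2 error) only as a simple baseline, and then measures the maximum error that schemes like those in the paper aim to minimize instead. All function names and the sample data are illustrative assumptions.

```python
import math


def haar_decompose(data):
    """Unnormalized Haar decomposition by pairwise averaging and differencing.

    Returns [overall average, detail coefficients from coarse to fine].
    The input length is assumed to be a power of two.
    """
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    coeffs = [0.0] * n
    current = [float(x) for x in data]
    while len(current) > 1:
        half = len(current) // 2
        averages = [(current[2 * i] + current[2 * i + 1]) / 2 for i in range(half)]
        details = [(current[2 * i] - current[2 * i + 1]) / 2 for i in range(half)]
        coeffs[half:2 * half] = details  # details of this level
        current = averages
    coeffs[0] = current[0]
    return coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose; zeroed coefficients yield an approximation."""
    current = [coeffs[0]]
    size = 1
    while size < len(coeffs):
        nxt = []
        for avg, det in zip(current, coeffs[size:2 * size]):
            nxt.extend((avg + det, avg - det))
        current = nxt
        size *= 2
    return current


def keep_largest(coeffs, budget):
    """Baseline thresholding: keep `budget` coefficients with the largest
    normalized magnitude (|c| * sqrt(support size)) and zero the rest."""
    n = len(coeffs)

    def weight(i):
        level = 0 if i == 0 else i.bit_length() - 1
        return abs(coeffs[i]) * math.sqrt(n / 2 ** level)

    kept = set(sorted(range(n), key=weight, reverse=True)[:budget])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def max_abs_error(original, approx):
    """Maximum absolute error between the data and its approximation."""
    return max(abs(a - b) for a, b in zip(original, approx))


data = [2, 2, 0, 2, 3, 5, 4, 4]           # illustrative sample values
coeffs = haar_decompose(data)             # full set of 8 coefficients
synopsis = keep_largest(coeffs, budget=4) # fixed-space synopsis of 4 coefficients
approx = haar_reconstruct(synopsis)
print(max_abs_error(data, approx))        # 0.5 for this sample
```

For this sample, retaining 4 of the 8 coefficients reproduces the data with a maximum absolute error of 0.5. A thresholding scheme for maximum error metrics would choose which coefficients to retain so that this quantity, rather than the L2 error, is as small as possible for the given budget.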
Pages: 663-677
Page count: 15