Learning-Based Sample Tuning for Approximate Query Processing in Interactive Data Exploration

Cited by: 0
Authors
Zhang, Hanbing [1]
Jing, Yinan [1]
He, Zhenying [1]
Zhang, Kai [1]
Wang, X. Sean [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200437, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Measurement; Adaptation models; Costs; Tuners; Accuracy; Q-learning; Query processing; Optimization; Synthetic data; Approximate query processing; interactive data exploration; data analysis;
DOI
10.1109/TKDE.2023.3341451
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For interactive data exploration, approximate query processing (AQP) is a useful approach that typically uses samples to answer queries quickly at the cost of some query accuracy. Existing AQP systems often materialize samples in memory for reuse to speed up query processing. How to tune the samples according to the workload is one of the key problems in AQP. However, because data exploration workloads are too complex to predict accurately, existing sample tuning approaches adapt poorly to changing workloads. To address this problem, this paper proposes a deep reinforcement learning-based sample tuner, RL-STuner. When tuning samples, RL-STuner considers workload changes from a global perspective and uses a Deep Q-learning Network (DQN) model to select an optimal sample set that has the maximum utility for the current workload. In addition, this paper proposes a set of optimization mechanisms to reduce the sample tuning cost. Experimental results on both real-world and synthetic datasets show that RL-STuner outperforms existing sample tuning approaches and achieves 1.6x-5.2x improvements in query accuracy with a low tuning cost.
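The accuracy-for-latency trade that the abstract attributes to sample-based AQP can be illustrated with a minimal sketch. This is not RL-STuner or any method from the paper: it only estimates an average from a uniform random sample and reports an approximate 95% confidence half-width under a normal approximation, ignoring the finite-population correction. The function name `approximate_avg`, the synthetic data, and the sample size are all assumptions chosen for illustration.

```python
import math
import random


def approximate_avg(values, sample_size, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, half_width), where half_width is an approximate
    95% confidence half-width from the normal approximation.
    Hypothetical helper for illustration; not part of the paper.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    mean = sum(sample) / sample_size
    # Unbiased sample variance (divides by n - 1).
    var = sum((x - mean) ** 2 for x in sample) / (sample_size - 1)
    half_width = 1.96 * math.sqrt(var / sample_size)
    return mean, half_width


# Synthetic column: 100,000 values cycling through 0..99.
data = [float(i % 100) for i in range(100_000)]
exact = sum(data) / len(data)  # full scan, the "exact" answer
estimate, half_width = approximate_avg(data, sample_size=1_000)
```

The estimate reads only 1% of the rows, and `half_width` quantifies the accuracy given up in exchange. Which samples to keep materialized as the workload shifts is the tuning question the paper addresses.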
Pages: 6532-6546
Page count: 15
Related papers
50 items in total
  • [41] Learning-based Query Performance Modeling and Prediction
    Akdere, Mert
    Cetintemel, Ugur
    Riondato, Matteo
    Upfal, Eli
    Zdonik, Stanley B.
    2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, : 390 - 401
  • [42] AQP++: Connecting Approximate Query Processing With Aggregate Precomputation for Interactive Analytics
    Peng, Jinglin
    Zhang, Dongxiang
    Wang, Jiannan
    Pei, Jian
    SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2018, : 1477 - 1492
  • [43] Accelerating Graph Processing With Lightweight Learning-Based Data Reordering
    Zou, Mo
    Zhang, Mingzhe
    Wang, Rujia
    Sun, Xian-He
    Ye, Xiaochun
    Fan, Dongrui
    Tang, Zhimin
    IEEE COMPUTER ARCHITECTURE LETTERS, 2022, 21 (01) : 5 - 8
  • [44] A Learning-Based Approach for Evaluating the Capacity of Data Processing Pipelines
    Alsayasneh, Maha
    De Palma, Noel
    EURO-PAR 2020: PARALLEL PROCESSING, 2020, 12247 : 52 - 67
  • [45] DBEst: Revisiting Approximate Query Processing Engines with Machine Learning Models
    Ma, Qingzhi
    Triantafillou, Peter
    SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2019, : 1553 - 1570
  • [46] An approach to content-based approximate query processing in peer-to-peer data systems
    Wang, CK
    Li, JZ
    Shi, SF
    GRID AND COOPERATIVE COMPUTING, PT 1, 2004, 3032 : 348 - 355
  • [47] A data model for approximate query processing of real-time databases
    Vrbsky, SV
    DATA & KNOWLEDGE ENGINEERING, 1996, 21 (01) : 79 - 102
  • [48] Data model for approximate query processing of real-time databases
    The Univ of Alabama, Tuscaloosa, United States
    Data Knowl Eng, 1 (79-102):
  • [49] Searching the Semantic Web: Approximate query processing based on ontologies
    Corby, O
    Dieng-Kuntz, R
    Gandon, F
    Faron-Zucker, C
    IEEE INTELLIGENT SYSTEMS, 2006, 21 (01) : 20 - 27
  • [50] An Online Approximate Aggregation Query Processing Method Based on Hadoop
    Zhang, Zhiqiang
    Hu, Jianghua
    Xie, Xiaoqin
    Pan, Haiwei
    Feng, Xiaoning
    2016 IEEE 20TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD), 2016, : 117 - 122