Learning-Based Sample Tuning for Approximate Query Processing in Interactive Data Exploration

Cited by: 0
Authors
Zhang, Hanbing [1]
Jing, Yinan [1]
He, Zhenying [1]
Zhang, Kai [1]
Wang, X. Sean [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200437, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Measurement; Adaptation models; Costs; Tuners; Accuracy; Q-learning; Query processing; Optimization; Synthetic data; Approximate query processing; interactive data exploration; data analysis;
DOI
10.1109/TKDE.2023.3341451
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For interactive data exploration, approximate query processing (AQP) is a useful approach that typically uses samples to answer queries quickly at the cost of some query accuracy. Existing AQP systems often materialize samples in memory for reuse to speed up query processing. How to tune the samples according to the workload is one of the key problems in AQP. However, because data exploration workloads are too complex to predict accurately, existing sample tuning approaches do not adapt well to changing workloads. To address this problem, this paper proposes a deep reinforcement learning-based sample tuner, RL-STuner. When tuning samples, RL-STuner considers workload changes from a global perspective and uses a Deep Q-learning Network (DQN) model to select an optimal sample set that has the maximum utility for the current workload. In addition, this paper proposes a set of optimization mechanisms to reduce the sample tuning cost. Experimental results on both real-world and synthetic datasets show that RL-STuner outperforms existing sample tuning approaches and achieves 1.6x-5.2x improvements in query accuracy with a low tuning cost.
Pages: 6532 - 6546
Page count: 15
Related papers
50 in total
  • [31] A learning-based framework for spatial join processing: estimation, optimization and tuning
    Vu, Tin
    Belussi, Alberto
    Migliorini, Sara
    Eldawy, Ahmed
    VLDB JOURNAL, 2024, 33 (04) : 1155 - 1177
  • [32] An analysis of query-agnostic sampling for interactive data exploration
    Liu, Wenzhao
    Diao, Yanlei
    Liu, Anna
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2018, 47 (16) : 3820 - 3837
  • [33] Efficiently processing deterministic approximate aggregation query on massive data
    Xixian Han
    Bailing Wang
    Jianzhong Li
    Hong Gao
    Knowledge and Information Systems, 2018, 57 : 437 - 473
  • [34] Efficiently processing deterministic approximate aggregation query on massive data
    Han, Xixian
    Wang, Bailing
    Li, Jianzhong
    Gao, Hong
    KNOWLEDGE AND INFORMATION SYSTEMS, 2018, 57 (02) : 437 - 473
  • [35] A Session-Based Approach to Fast-But-Approximate Interactive Data Cube Exploration
    Kamat, Niranjan
    Nandi, Arnab
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2018, 12 (01)
  • [36] Deep Learning-Based Enhancement of Small Sample Liquefaction Data
    Chen, Mingyue
    Kang, Xin
    Ma, Xiongying
    INTERNATIONAL JOURNAL OF GEOMECHANICS, 2023, 23 (09)
  • [37] Deep learning-based real-time query processing for wireless sensor network
    Lee, Ki-Seong
    Lee, Sun-Ro
    Kim, Youngmin
    Lee, Chan-Gun
    INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2017, 13 (05)
  • [38] Learning-based Automatic Parameter Tuning for Big Data Analytics Frameworks
    Bao, Liang
    Liu, Xin
    Chen, Weizhao
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018 : 181 - 190
  • [39] Flex Query: An Online Query System for Interactive Remote Visual Data Exploration at Large Scale
    Zou, Hongbo
    Schwan, Karsten
    Slawinska, Magdalena
    Wolf, Matt
    Eisenhauer, Greg
    Zheng, Fang
    Dayal, Jai
    Logan, Jeremy
    Liu, Qing
    Klasky, Scott
    Bode, Tanja
    Clark, Michael
    Kinsey, Matt
    2013 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2013
  • [40] Learning-Based SPARQL Query Performance Prediction
    Zhang, Wei Emma
    Sheng, Quan Z.
    Taylor, Kerry
    Qin, Yongrui
    Yao, Lina
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2016, PT I, 2016, 10041 : 313 - 327