Synopses for query optimization: A space-complexity perspective

Cited by: 6
Authors
Kaushik, R
Naughton, JF
Ramakrishnan, R
Chakravarthy, VT
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
[2] Univ Wisconsin, Dept Comp Sci, Madison, WI 53705 USA
[3] IBM India Res Lab, New Delhi 110016, India
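Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2005 / Vol. 30 / Issue 04
Keywords
theory; performance; cardinality estimation; histograms; sampling
DOI
10.1145/1114244.1114251
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract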
Database systems use precomputed synopses of data to estimate the cost of alternative plans during query optimization. A number of alternative synopsis structures have been proposed, but histograms are by far the most commonly used. While histograms have proved to be very effective in (cost estimation for) single-table selections, queries with joins have long been seen as a challenge; under a model where histograms are maintained for individual tables, a celebrated result of Ioannidis and Christodoulakis [1991] observes that errors propagate exponentially with the number of joins in a query. In this article, we make two main contributions. First, we study the space complexity of using synopses for query optimization from a novel information-theoretic perspective. In particular, we offer evidence in support of histograms for single-table selections, including an analysis over data distributions known to be common in practice, and illustrate their limitations for join queries. Second, for a broad class of common queries involving joins (specifically, all queries involving only key-foreign key joins) we show that the strategy of storing a small precomputed sample of the database yields probabilistic guarantees that are almost space-optimal, which is an important property if these samples are to be used as database statistics. This is the first such optimality result, to our knowledge, and suggests that precomputed samples might be an effective way to circumvent the error propagation problem for queries with key-foreign key joins. We support this result empirically through an experimental study that demonstrates the effectiveness of precomputed samples, and also shows the increasing difference in the effectiveness of samples versus multidimensional histograms as the number of joins in the query grows.
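The sampling idea in the abstract can be illustrated with a minimal sketch. It assumes one common construction (a uniform sample of the foreign-key table, with each sampled row joined to the row it references), which is not necessarily the authors' exact scheme. The table contents, column names, and the helper functions `build_join_sample` and `estimate_cardinality` are all hypothetical.

```python
import random

def build_join_sample(fact_rows, dim_by_key, fk, sample_size, seed=0):
    """Precompute a synopsis: sample the foreign-key (fact) table uniformly
    without replacement, then attach the referenced dimension row to each
    sampled row. Every foreign key matches exactly one key, so the result is
    a uniform sample of the key-foreign key join."""
    rng = random.Random(seed)
    sampled = rng.sample(fact_rows, min(sample_size, len(fact_rows)))
    return [{**row, **dim_by_key[row[fk]]} for row in sampled]

def estimate_cardinality(join_sample, fact_size, predicate):
    """Scale the fraction of sampled join rows that satisfy the predicate
    up to the full join size (which equals the fact table size)."""
    if not join_sample:
        return 0.0
    hits = sum(1 for row in join_sample if predicate(row))
    return fact_size * hits / len(join_sample)

# Hypothetical toy database: 100 customers, 10,000 orders.
customers = {
    i: {"cust_id": i, "region": "EU" if i % 4 == 0 else "US"}
    for i in range(100)
}
orders = [
    {"order_id": j, "cust_id": j % 100, "amount": j % 50}
    for j in range(10_000)
]

# Built once, ahead of query optimization.
sample = build_join_sample(orders, customers, "cust_id", sample_size=500)

# A selection spanning both tables of the join.
def pred(row):
    return row["region"] == "EU" and row["amount"] >= 25

estimate = estimate_cardinality(sample, len(orders), pred)
exact = sum(1 for o in orders if pred({**o, **customers[o["cust_id"]]}))
print(f"estimated rows: {estimate:.0f}, exact rows: {exact}")
```

Because the predicate is evaluated on already-joined sample rows, the estimate does not multiply per-table selectivities, which is the step where errors compound under per-table histograms.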
Pages: 1102 - 1127
Page count: 26
Related papers
50 items in total
  • [21] Technical Perspective: Query Optimization for Faster Deep CNN Explanations
    Schelter, Sebastian
    SIGMOD RECORD, 2020, 49 (01) : 60 - 60
  • [22] The Query Complexity of Certification
    Blanc, Guy
    Koch, Caleb
    Lange, Jane
    Tan, Li-Yang
    PROCEEDINGS OF THE 54TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING (STOC '22), 2022, : 623 - 636
  • [23] On the query complexity of sets
    Beigel, R
    Gasarch, W
    Kummer, M
    Martin, G
    McNicholl, T
    Stephan, F
    MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE 1996, 1996, 1113 : 206 - 217
  • [24] Query Complexity in Expectation
    Kaniewski, Jedrzej
    Lee, Troy
    de Wolf, Ronald
    AUTOMATA, LANGUAGES, AND PROGRAMMING, PT I, 2015, 9134 : 761 - 772
  • [25] Low space-complexity and low power semi-systolic multiplier architectures over GF(2^m) based on irreducible trinomial
    Gebali, Fayez
    Ibrahim, Atef
    MICROPROCESSORS AND MICROSYSTEMS, 2016, 40 : 45 - 52
  • [26] From quantum query complexity to state complexity
    Zheng, Shenggen
    Qiu, Daowen
    Qiu, Daowen, 1600, Springer Verlag (8808): : 231 - 245
  • [27] Subquadratic Space-Complexity Digit-Serial Multipliers Over GF(2^m) Using Generalized (a,b)-Way Karatsuba Algorithm
    Lee, Chiou-Yng
    Meher, Pramod Kumar
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2015, 62 (04) : 1091 - 1098
  • [28] On the query complexity of real functionals
    Feree, Hugo
    Hoyrup, Mathieu
    Gomaa, Walid
    2013 28TH ANNUAL IEEE/ACM SYMPOSIUM ON LOGIC IN COMPUTER SCIENCE (LICS), 2013, : 103 - 112
  • [29] On Exact Quantum Query Complexity
    Ashley Montanaro
    Richard Jozsa
    Graeme Mitchison
    Algorithmica, 2015, 71 : 775 - 796
  • [30] THE QUANTUM QUERY COMPLEXITY OF CERTIFICATION
    Ambainis, Andris
    Childs, Andrew M.
    Le Gall, Francois
    Tani, Seiichiro
    QUANTUM INFORMATION & COMPUTATION, 2010, 10 (3-4) : 181 - 189