Improving hash join performance through prefetching

Citations: 67
Authors
Chen, Shimin
Ailamaki, Anastassia
Gibbons, Phillip B.
Mowry, Todd C.
Affiliations
[1] Intel Res Pittsburgh, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Dept Comp Sci, Sch Comp Sci, Pittsburgh, PA 15213 USA
Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2007 / Vol. 32 / No. 3
Keywords
algorithms; design; performance; hash join; CPU cache performance; CPU cache prefetching; group prefetching; software-pipelined prefetching
DOI
10.1145/1272743.1272747
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hash join algorithms suffer from extensive CPU cache stalls. This article shows that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 80% of its user time stalled on CPU cache misses, and explores the use of CPU cache prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 1.29-4.04X speedups for the join phase and 1.37-3.49X speedups for the partition phase over GRACE and simple prefetching approaches. Moreover, compared with previous cache-aware approaches (i.e. cache partitioning), the schemes are at least 36% faster on large relations and do not require exclusive use of the CPU cache to be effective. Finally, comparing the elapsed real times when disk I/Os are in the picture, our cache prefetching schemes achieve 1.12-1.84X speedups for the join phase and 1.06-1.60X speedups for the partition phase over the GRACE hash join algorithm.
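The idea behind group prefetching can be illustrated with a small probe loop. The following is a minimal sketch, not the article's implementation: it assumes a chained hash table with hypothetical names (`HashTable`, `Node`, `probe_group`, `GROUP`), uses the GCC/Clang builtin `__builtin_prefetch`, and only prefetches the bucket slot and the first chain node, whereas the article handles multiple code paths and longer dependency chains. Probe keys are processed in groups, and each stage issues prefetches for the whole group before any key moves to the next dependent memory access, so cache misses of different keys can overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical group size; in practice it would be tuned to the miss latency. */
#define GROUP 16

typedef struct Node {
    uint32_t key;
    uint32_t payload;
    struct Node *next;
} Node;

typedef struct {
    Node **buckets;
    size_t mask; /* number of buckets minus one (power of two) */
} HashTable;

/* Simple multiplicative hash, chosen only for illustration. */
static size_t hash_u32(uint32_t k) {
    uint32_t h = k * 2654435761u;
    return (size_t)(h ^ (h >> 16));
}

static HashTable ht_create(size_t n_buckets) {
    assert(n_buckets != 0 && (n_buckets & (n_buckets - 1)) == 0);
    HashTable t = { calloc(n_buckets, sizeof(Node *)), n_buckets - 1 };
    assert(t.buckets != NULL);
    return t;
}

static void ht_insert(HashTable *t, uint32_t key, uint32_t payload) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    size_t b = hash_u32(key) & t->mask;
    n->key = key;
    n->payload = payload;
    n->next = t->buckets[b];
    t->buckets[b] = n;
}

/* Probes n keys; returns the match count and adds matched payloads to *sum. */
static size_t probe_group(const HashTable *t, const uint32_t *keys,
                          size_t n, uint64_t *sum) {
    size_t matches = 0;
    size_t slot[GROUP];
    const Node *head[GROUP];

    for (size_t base = 0; base < n; base += GROUP) {
        size_t g = (n - base < GROUP) ? n - base : GROUP;

        /* Stage 1: hash every key in the group, prefetch its bucket slot. */
        for (size_t i = 0; i < g; i++) {
            slot[i] = hash_u32(keys[base + i]) & t->mask;
            __builtin_prefetch(&t->buckets[slot[i]], 0, 3);
        }
        /* Stage 2: read each bucket slot, prefetch the first chain node. */
        for (size_t i = 0; i < g; i++) {
            head[i] = t->buckets[slot[i]];
            if (head[i] != NULL)
                __builtin_prefetch(head[i], 0, 3);
        }
        /* Stage 3: walk each chain and compare keys (later nodes are not
           prefetched in this simplified sketch). */
        for (size_t i = 0; i < g; i++) {
            for (const Node *p = head[i]; p != NULL; p = p->next) {
                if (p->key == keys[base + i]) {
                    matches++;
                    *sum += p->payload;
                }
            }
        }
    }
    return matches;
}
```

Software-pipelined prefetching, the second technique named in the abstract, would instead run the stages of different keys within a single loop iteration, with each stage working on a key a fixed distance ahead of the next stage, rather than finishing one group before starting the next.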
Pages: 36