OPTIMIZING EQUIJOIN QUERIES IN DISTRIBUTED DATABASES WHERE RELATIONS ARE HASH PARTITIONED

Cited by: 27
Authors
SHASHA, D [1]
WANG, TL [1]
Affiliation
[1] NEW JERSEY INST TECHNOL, DEPT COMP & INFORMAT SCI, NEWARK, NJ 07102
Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 1991 / Volume 16 / Issue 02
Keywords
ALGORITHMS; PERFORMANCE; THEORY; EQUIJOIN; HASHING; NP-COMPLETE PROBLEMS; RELATIONAL DATA MODELS; SPANNING TREES; SYSTEMS;
DOI
10.1145/114325.103713
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Consider the class of distributed database systems consisting of a set of nodes connected by a high-bandwidth network. Each node consists of a processor, a random access memory, and a slower but much larger memory such as a disk. There is no shared memory among the nodes. The data are horizontally partitioned, often using a hash function. Such a description characterizes many parallel or distributed database systems that have recently been proposed, both commercial and academic. We study the optimization problem that arises when the query processor must repartition the relations and intermediate results participating in a multijoin query. Using estimates of the sizes of intermediate relations, we show (1) optimum solutions for closed chain queries; (2) the NP-completeness of the optimization problem for star, tree, and general graph queries; and (3) effective heuristics for these hard cases. Our general approach and many of our results extend to other attribute partitioning schemes, for example, sort-partitioning on attributes, and to partitioned object databases.
Pages: 279 - 308
Page count: 30