Improving Distance-Join Query processing with Voronoi-Diagram based partitioning in SpatialHadoop

Cited by: 12
Authors
Garcia-Garcia, Francisco [1]
Corral, Antonio [1]
Iribarne, Luis [1]
Vassilakopoulos, Michael [2]
Affiliations
[1] Univ Almeria, Dept Informat, Almeria, Spain
[2] Univ Thessaly, Dept Elect & Comp Engn, Volos, Greece
Keywords
Data partitioning; K nearest neighbors join; K closest pairs; SpatialHadoop; MapReduce; Spatial query evaluation; ALGORITHMS;
DOI
10.1016/j.future.2019.10.037
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
SpatialHadoop is an extended MapReduce framework supporting global indexing techniques that partition spatial datasets across several machines and improve spatial query processing performance compared to traditional Hadoop systems. SpatialHadoop supports several spatial operations (e.g., K Nearest Neighbor search, range query, spatial intersection join, etc.) and seven spatial partitioning techniques (Grid, Quadtree, STR, STR+, k-d tree, Z-curve and Hilbert-curve). Distance-Join Queries (DJQs), like the K Nearest Neighbors Join Query (KNNJQ) and K Closest Pairs Query (KCPQ), are common operations used in numerous spatial applications. DJQs are costly operations, since they combine spatial joins with distance-based search. Data partitioning improves the management of large datasets and speeds up query performance. Therefore, performing DJQs efficiently with new partitioning methods in SpatialHadoop is a challenging task. In this paper, a new data partitioning technique based on Voronoi-Diagrams is designed and implemented in SpatialHadoop. Moreover, improved KNNJQ and KCPQ MapReduce algorithms, using the new partitioning mechanism, are also designed and developed for SpatialHadoop. Finally, the results of an extensive set of experiments with real-world datasets are presented, demonstrating that the new partitioning technique and the improved DJQ MapReduce algorithms are efficient, scalable and robust in SpatialHadoop. (C) 2019 Elsevier B.V. All rights reserved.
Pages: 723-740
Number of pages: 18