VEDAS: an efficient GPU alternative for store and query of large RDF data sets

Cited by: 3
Authors
Makpaisit, Pisit [1 ]
Chantrapornchai, Chantana [1 ]
Affiliations
[1] Kasetsart Univ, Dept Comp Engn, Bangkok, Thailand
Keywords
Query processing; Parallel processing; Graphic Processing Units; Resource Description Framework; SPARQL; SPARQL QUERIES;
DOI
10.1186/s40537-021-00513-y
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Resource Description Framework (RDF) is commonly used as a standard for data interchange on the web. A collection of RDF data sets can form a large graph that is time-consuming to query. Modern Graphics Processing Units (GPUs) can execute parallel programs to shorten running time. In this paper, we propose a novel RDF data representation, along with a query processing algorithm, that is suitable for GPU processing. Because the main challenges of the GPU architecture are limited memory size, memory transfer latency, and the vast number of GPU cores, our system is designed to make better use of the GPU cores and to reduce the effect of memory transfer. We propose a representation consisting of indices and column-based RDF ID data that reduces the GPU memory requirement. Indexing and pre-upload filtering techniques are then applied to reduce data transfer between host and GPU memory. We add an index swapping process to facilitate sorting and joining data on a given variable, and a pre-upload step to reduce the size of the result storage and the data transfer time. The experimental results show that our representation is about 35% smaller than the traditional NT format and 40% smaller than that of gStore. Query processing achieves speedups ranging from 1.95 to 397.03 over RDF-3X and gStore on the WatDiv test suite. On the LUBM benchmark, it achieves speedups of 578.57 and 62.97 over RDF-3X and gStore, respectively. The analysis shows which query cases benefit from our approach.
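The abstract's idea of "indices and column-based RDF ID data" with a pre-upload filter can be illustrated with a minimal sketch. This is not the VEDAS implementation: it is plain CPU-side Python, and every name in it (`encode`, `predicate_range`, `select_for_upload`) is hypothetical. It assumes terms are dictionary-encoded to integer IDs, triples are stored as three parallel columns sorted by predicate, and only the rows matching a query's predicate would be copied to the GPU.

```python
from bisect import bisect_left, bisect_right


def encode(triples):
    """Dictionary-encode terms to integer IDs and return sorted ID columns."""
    ids = {}

    def term_id(term):
        # Assign the next free integer the first time a term is seen.
        return ids.setdefault(term, len(ids))

    # Sort rows by (predicate, subject, object) so each predicate is contiguous.
    rows = sorted((term_id(p), term_id(s), term_id(o)) for s, p, o in triples)
    # Column-based layout: one flat list per triple position.
    p_col = [r[0] for r in rows]
    s_col = [r[1] for r in rows]
    o_col = [r[2] for r in rows]
    return ids, p_col, s_col, o_col


def predicate_range(p_col, pid):
    """Index lookup: half-open row range [lo, hi) holding predicate `pid`."""
    return bisect_left(p_col, pid), bisect_right(p_col, pid)


def select_for_upload(ids, p_col, s_col, o_col, predicate):
    """Filter on the host so only matching rows would be sent to the GPU."""
    pid = ids.get(predicate)
    if pid is None:
        return [], []
    lo, hi = predicate_range(p_col, pid)
    return s_col[lo:hi], o_col[lo:hi]


triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "age", "30"),
]
ids, p_col, s_col, o_col = encode(triples)
subjects, objects = select_for_upload(ids, p_col, s_col, o_col, "knows")
```

In this sketch the two returned slices stand in for the data that a host-to-device copy would carry; the rows for the unrelated `age` predicate never leave host memory.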
Pages: 34
Related papers
50 in total
  • [31] Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters
    Dashti, Ali
    Komarov, Ivan
    D'Souza, Roshan M.
    PLOS ONE, 2013, 8 (09):
  • [32] Efficient query processing for large XML data in distributed environments
    Kurita, Hiroto
    Hatano, Kenji
    Miyazaki, Jun
    Uemura, Shunsuke
    21ST INTERNATIONAL CONFERENCE ON ADVANCED NETWORKING AND APPLICATIONS, PROCEEDINGS, 2007, : 317 - +
  • [33] Parallel acceleration of CPU and GPU range queries over large data sets
    Mitchell Nelson
    Zachary Sorenson
    Joseph M. Myre
    Jason Sawin
    David Chiu
    Journal of Cloud Computing, 9
  • [34] Parallel acceleration of CPU and GPU range queries over large data sets
    Nelson, Mitchell
    Sorenson, Zachary
    Myre, Joseph M.
    Sawin, Jason
    Chiu, David
    JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS, 2020, 9 (01):
  • [35] Efficient co-triangulation of large data sets
    Weimer, H
    Warren, J
    Troutner, J
    Wiggins, W
    Shrout, J
    VISUALIZATION '98, PROCEEDINGS, 1998, : 119 - +
  • [36] Efficient nonparametric population modeling for large data sets
    De Nicolao, Giuseppe
    Pillonetto, Gianluigi
    Chierici, Marco
    Cobelli, Claudio
    2007 AMERICAN CONTROL CONFERENCE, VOLS 1-13, 2007, : 1648 - +
  • [37] An Efficient and Compact Indexing Scheme for Large-scale Data Store
    Lu, Peng
    Wu, Sai
    Shou, Lidan
    Tan, Kian-Lee
    2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 326 - 337
  • [38] A novel data structure for efficient representation of large data sets in data mining
    Pai, Radhika M.
    Ananthanarayana, V. S.
    2006 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATIONS, VOLS 1 AND 2, 2007, : 533 - 538
  • [39] Top-k Exploration of Query Candidates for Efficient Keyword Search on Graph-Shaped (RDF) Data
    Tran, Thanh
    Wang, Haofen
    Rudolph, Sebastian
    Cimiano, Philipp
    ICDE: 2009 IEEE 25TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2009, : 405 - +
  • [40] An Efficient Algorithm for Probabilistic RkNN Query on Uncertain Data with Large k
    Wang, Shengsheng
    Wang, Chuangfeng
    Liu, Wei
    Wang, Qi
    2016 3RD INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE), 2016, : 189 - 193