Distributed Efficient Provenance-Aware Regular Path Queries on Large RDF Graphs

Cited by: 4
Authors
Xin, Yueqi [1, 2]
Wang, Xin [1, 2]
Jin, Di [1, 2]
Wang, Simiao [1, 2]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China
[2] Tianjin Key Lab Cognit Comp & Applicat, Tianjin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Regular path query; Provenance-aware; RDF graph; Pregel
DOI
10.1007/978-3-319-91452-7_49
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the proliferation of knowledge graphs, massive RDF graphs have been published on the Web. Regular Path Queries (RPQs), an essential type of query over RDF graphs, have attracted increasing research effort. However, existing query processing approaches mainly focus on the standard semantics of RPQs, which cannot provide the provenance of the answer sets. We propose dProvRPQ, a distributed approach to evaluating provenance-aware RPQs over large RDF graphs. Our Pregel-based method employs Glushkov automata to track the matching processes of RPQs in parallel. In addition, four optimization strategies are devised: edge filtering, candidate states, message compression, and message selection. These strategies dramatically reduce the intermediate results of the basic dProvRPQ algorithm and overcome the counting-paths problem to some extent. The proposed algorithms are verified by extensive experiments on both synthetic and real-world datasets, which show that our approach can efficiently answer provenance-aware RPQs over large RDF graphs.
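The idea of tracking automaton states with vertex-to-vertex messages can be illustrated with a minimal single-machine sketch. It is not the dProvRPQ algorithm and includes none of the four optimizations. The toy graph, the query `knows+ / worksAt`, the hand-built position automaton, and the function name `evaluate` are all assumptions made for illustration. To guarantee termination on cyclic graphs, the sketch only follows paths that never revisit the same (vertex, state) pair, which is a simplification and may differ from the paper's provenance semantics.

```python
from collections import defaultdict

# Hypothetical toy RDF graph as (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "bob"),
    ("carol", "worksAt", "acme"),
]

# Hand-built Glushkov (position) automaton for the query  knows+ / worksAt.
# State 0 is initial; states 1 and 2 are the positions of `knows` and `worksAt`.
INITIAL, FINALS = 0, {2}
DELTA = {
    (0, "knows"): {1},
    (1, "knows"): {1},
    (1, "worksAt"): {2},
}


def evaluate(triples, delta, initial, finals):
    """Return {(start, end): set of witness paths}; each path is a tuple of triples."""
    out_edges = defaultdict(list)
    vertices = set()
    for s, p, o in triples:
        out_edges[s].append((p, o))
        vertices.update((s, o))

    # A message is (start vertex, automaton state, visited product nodes, path so far).
    inbox = {v: [(v, initial, frozenset({(v, initial)}), ())] for v in vertices}
    answers = defaultdict(set)

    while inbox:  # each loop iteration plays the role of one superstep
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            for start, state, seen, path in messages:
                if state in finals:
                    # The path itself serves as the provenance of this answer.
                    answers[(start, v)].add(path)
                for p, o in out_edges[v]:
                    for nxt in delta.get((state, p), ()):
                        if (o, nxt) in seen:
                            continue  # assumption: skip revisits so the loop ends
                        outbox[o].append(
                            (start, nxt, seen | {(o, nxt)}, path + ((v, p, o),))
                        )
        inbox = outbox
    return dict(answers)
```

Calling `evaluate(TRIPLES, DELTA, INITIAL, FINALS)` yields one entry per matching (start, end) pair together with the edges that justify it. Because every message carries a full path, the number of messages can grow quickly with the number of matching paths, which is the kind of intermediate-result blow-up the abstract's optimization strategies are said to reduce.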
Pages: 766-782
Page count: 17