Granite: A distributed engine for scalable path queries over temporal property graphs

Cited by: 3
Authors
Ramesh, Shriram [1]
Baranawal, Animesh [1]
Simmhan, Yogesh [1]
Affiliations
[1] Indian Inst Sci, Dept Computat & Data Sci, Bangalore 560012, Karnataka, India
Keywords
Graph processing; Temporal graphs; Distributed scheduling; Big data platforms; Query planning
DOI
10.1016/j.jpdc.2021.02.004
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Property graphs are a common form of linked data, with path queries used to traverse and explore them for enterprise transactions and mining. Temporal property graphs are a recent variant where time is a first-class entity to be queried over, and their properties and structure vary over time. These are seen in social, telecom, transit and epidemic networks. However, current graph databases and query engines have limited support for temporal relations among graph entities, no support for time-varying entities and/or do not scale on distributed resources. We address this gap by extending a linear path query model over property graphs to include intuitive temporal predicates and aggregation operators over temporal graphs. We design a distributed execution model for these temporal path queries using the interval-centric computing model, and develop a novel cost model to select an efficient execution plan from several. We perform detailed experiments on our Granite distributed query engine using both static and dynamic temporal property graphs as large as 52M vertices, 218M edges and 325M properties, and a 1600-query workload derived from the LDBC benchmark. We frequently offer sub-second query latencies on a commodity cluster, which is 149x-1140x faster compared to the industry-leading Neo4J shared-memory graph database and the JanusGraph/Spark distributed graph query engine. Granite also completes 100% of the queries for all graphs, compared to only 32-92% workload completion by the baseline systems. Further, our cost model selects a query plan that is within 10% of the optimal execution time in 90% of the cases. Despite the irregular nature of graph processing, we exhibit a weak-scaling efficiency of >= 60% on 8 nodes and >= 40% on 16 nodes, for most query workloads. © 2021 Elsevier Inc. All rights reserved.
Pages: 94-111
Number of pages: 18
Related papers
35 in total
  • [21] Optimal Subgraph Matching Queries over Distributed Knowledge Graphs Based on Partial Evaluation
    Xing, Jiao
    Liu, Baozhu
    Li, Jianxin
    Choudhury, Farhana Murtaza
    Wang, Xin
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2021, PT I, 2021, 13080: 274-289
  • [22] Horton+: A Distributed System for Processing Declarative Reachability Queries over Partitioned Graphs
    Sarwat, Mohamed
    Elnikety, Sameh
    He, Yuxiong
    Mokbel, Mohamed F.
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (14): 1918-1929
  • [23] GoDB: From Batch Processing to Distributed Querying over Property Graphs
    Jamadagni, Nitin
    Simmhan, Yogesh
    2016 16TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2016: 281-290
  • [24] Sparklify: A Scalable Software Component for Efficient Evaluation of SPARQL Queries over Distributed RDF Datasets
    Stadler, Claus
    Sejdiu, Gezim
    Graux, Damien
    Lehmann, Jens
    SEMANTIC WEB - ISWC 2019, PT II, 2019, 11779: 293-308
  • [25] An Interval-centric Model for Distributed Computing over Temporal Graphs
    Gandhi, Swapnil
    Simmhan, Yogesh
    2020 IEEE 36TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2020), 2020: 1129-1140
  • [26] FlowGraph: Distributed temporal pattern detection over dynamically evolving graphs
    Chaudhry, Hassan Nazeer
    DEBS'19: PROCEEDINGS OF THE 13TH ACM INTERNATIONAL CONFERENCE ON DISTRIBUTED AND EVENT-BASED SYSTEMS, 2019: 272-275
  • [27] Distributed Set Label-Constrained Reachability Queries over Billion-Scale Graphs
    Zeng, Yuanyuan
    Yang, Wangdong
    Zhou, Xu
    Xiao, Guoqing
    Gao, Yunjun
    Li, Kenli
    2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022), 2022: 1969-1981
  • [28] GeoTrie: A Scalable Architecture for Location-Temporal Range Queries over Massive GeoTagged Data Sets
    Cortes, Rudyar
    Bonnaire, Xavier
    Marin, Olivier
    Arantes, Luciana
    Sens, Pierre
    15TH IEEE INTERNATIONAL SYMPOSIUM ON NETWORK COMPUTING AND APPLICATIONS (IEEE NCA 2016), 2016: 10-17
  • [29] Answering Min-Max Resource-Constrained Shortest Path Queries Over Large Graphs
    Qian, Haoran
    Zheng, Weiguo
    Zhang, Zhijie
    Fu, Bo
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (01): 60-74
  • [30] Distributed Multi-Agent Coverage Path Planning Over Graphs With Relaxed Priority Rule
    Alaviani, Seyyed Shaho
    Velni, Javad Mohammadpour
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (10): 14462-14473