Kaskade: Graph Views for Efficient Graph Analytics

Cited by: 7
Authors
da Trindade, Joana M. F. [1]
Karanasos, Konstantinos [2]
Curino, Carlo [2]
Madden, Samuel [1]
Shun, Julian [1]
Affiliations
[1] MIT, CSAIL, Cambridge, MA 02139 USA
[2] Microsoft, Albuquerque, NM USA
DOI
10.1109/ICDE48307.2020.00024
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Graphs are a natural way to model real-world entities and relationships between them, ranging from social networks to data lineage graphs and biological datasets. Queries over these large graphs often involve expensive sub-graph traversals and complex analytical computations. These real-world graphs are often substantially more structured than a generic vertex-and-edge model would suggest, but this insight has remained mostly unexplored by existing graph engines for graph query optimization purposes. In this work, we leverage structural properties of graphs and queries to automatically derive materialized graph views that can dramatically speed up query evaluation. We present KASKADE, the first graph query optimization framework to exploit materialized graph views for query optimization purposes. KASKADE employs a novel constraint-based view enumeration technique that mines constraints from query workloads and graph schemas, and injects them during view enumeration to significantly reduce the search space of views to be considered. Moreover, it introduces a graph view size estimator to pick the most beneficial views to materialize given a query set and to select the best query evaluation plan given a set of materialized views. We evaluate its performance over real-world graphs, including the provenance graph that we maintain at Microsoft to enable auditing, service analytics, and advanced system optimizations. Our results show that KASKADE substantially reduces the effective graph size and yields significant performance speedups (up to 50X), in some cases making otherwise intractable queries possible.
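The idea of a materialized graph view can be illustrated with a minimal sketch. It is not KASKADE's implementation, and the abstract does not say which view types the system derives. The sketch assumes a toy lineage graph with two vertex types, "job" and "file", in which jobs only write files and files are only read by jobs. Under that assumption, each job -> file -> job path can be contracted into one job -> job edge, and job-level reachability queries can run over the smaller view. The function names and the sample data are hypothetical.

```python
from collections import defaultdict, deque


def build_job_to_job_view(edges, vertex_type):
    """Contract every job -> file -> job path into a single job -> job edge.

    Assumption (not from the abstract): the schema only allows
    job -> file and file -> job edges, so the contraction loses no
    job-level reachability information.
    """
    out = defaultdict(set)
    for src, dst in edges:
        out[src].add(dst)

    view = defaultdict(set)
    for job, files in out.items():
        if vertex_type[job] != "job":
            continue
        for f in files:
            if vertex_type[f] != "file":
                continue
            for consumer in out.get(f, ()):
                if vertex_type[consumer] == "job":
                    view[job].add(consumer)
    return dict(view)


def downstream(adjacency, start):
    """Breadth-first search: vertices reachable from start, excluding start."""
    seen, queue = set(), deque([start])
    while queue:
        v = queue.popleft()
        for w in adjacency.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    seen.discard(start)
    return seen


# Hypothetical lineage data: j1 writes f1, j2 reads f1 and writes f2, j3 reads f2.
vertex_type = {"j1": "job", "j2": "job", "j3": "job", "f1": "file", "f2": "file"}
edges = [("j1", "f1"), ("f1", "j2"), ("j2", "f2"), ("f2", "j3")]

view = build_job_to_job_view(edges, vertex_type)
print(downstream(view, "j1"))  # jobs that depend, directly or transitively, on j1
```

In this toy case the view has two edges over three vertices, while the base graph has four edges over five vertices, so the traversal visits fewer elements. Deciding which such views are worth materializing for a given workload is the problem the abstract attributes to the view enumeration and size estimation components.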
Pages: 193 - 204
Page count: 12
Related papers
50 in total
  • [1] Energy Efficient Architecture for Graph Analytics Accelerators
    Ozdal, Muhammet Mustafa
    Yesil, Serif
    Kim, Taemin
    Ayupov, Andrey
    Greth, John
    Burns, Steven
    Ozturk, Ozcan
    2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, : 166 - 177
  • [2] SQL-G: Efficient Graph Analytics by SQL
    Zhao, Kangfei
    Su, Jiao
    Yu, Jeffrey Xu
    Zhang, Hao
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (05) : 2237 - 2251
  • [3] Survey on Isomorphic Graph Algorithms for Graph Analytics
    Sangkaran, Theyvaa
    Abdullah, Azween
    JhanJhi, N. Z.
    Supramaniam, Mahadevan
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2019, 19 (01) : 85 - 92
  • [4] An adaptive graph sampling framework for graph analytics
    Wang, Kewen
    SOCIAL NETWORK ANALYSIS AND MINING, 2023, 14 (01)
  • [5] Parallel Graph Analytics
    Lenharth, Andrew
    Nguyen, Donald
    Pingali, Keshav
    COMMUNICATIONS OF THE ACM, 2016, 59 (05) : 78 - 87
  • [6] Distributed Graph Analytics
    Srikant, Y. N.
    DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY (ICDCIT 2020), 2020, 11969 : 3 - 20
  • [7] The Future of Graph Analytics
    Bonifati, Angela
    Ozsu, M. Tamer
    Tian, Yuanyuan
    Voigt, Hannes
    Yu, Wenyuan
    Zhang, Wenjie
    COMPANION OF THE 2024 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, SIGMOD-COMPANION 2024, 2024, : 544 - 545
  • [8] Architectural Requirements for Energy Efficient Execution of Graph Analytics Applications
    Ozdal, Muhammet Mustafa
    Yesil, Serif
    Kim, Taemin
    Ayupov, Andrey
    Burns, Steven
    Ozturk, Ozcan
    2015 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD), 2015, : 676 - 681
  • [9] LLAMA: Efficient Graph Analytics Using Large Multiversioned Arrays
    Macko, Peter
    Marathe, Virendra J.
    Margo, Daniel W.
    Seltzer, Margo I.
    2015 IEEE 31ST INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2015, : 363 - 374
  • [10] FAM-Graph: Graph Analytics on Disaggregated Memory
    Zahka, Daniel
    Gavrilovska, Ada
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2022), 2022, : 81 - 92