Efficient Graph Query Processing over Geo-Distributed Datacenters

Cited by: 8
Authors
Yuan, Ye [1]
Ma, Delong [2]
Wen, Zhenyu [3]
Ma, Yuliang [2]
Wang, Guoren [1]
Chen, Lei [4]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] Northeastern Univ, Shenyang, Peoples R China
[3] Newcastle Univ, Newcastle Upon Tyne, Tyne & Wear, England
[4] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
Keywords
Graph search; Geo-distributed; Datacenters; MapReduce
DOI
10.1145/3397271.3401157
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Graph queries have emerged as one of the fundamental techniques supporting modern search services, such as PageRank web search, social networking search and knowledge graph search. Because such graphs are maintained globally and are very large (e.g., billions of nodes), graph queries must be processed efficiently across multiple geographically distributed datacenters, that is, as geo-distributed graph queries. Existing graph computing frameworks may not work well for geographically distributed datacenters, because they implement a Bulk Synchronous Parallel model that requires excessive inter-datacenter transfers, thereby introducing very high latency for query processing. In this paper, we propose GeoGraph, a universal framework that supports efficient geo-distributed graph query processing based on clustering datacenters and a meta-graph, while reducing inter-datacenter communication. Our new framework can be applied to many types of graph algorithms without any modification. The framework is developed on top of Apache Giraph. The experiments were conducted by applying four important graph queries, i.e., shortest path, graph keyword search, subgraph isomorphism and PageRank. The evaluation results show that our proposed framework can achieve up to 82% faster convergence, 42% lower WAN bandwidth usage, and 45% less total monetary cost for the four graph queries, with input graphs stored across ten geo-distributed datacenters.
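To make the communication argument in the abstract concrete, the following is a minimal illustrative sketch, not the GeoGraph implementation. It assumes a toy vertex-centric Bulk Synchronous Parallel superstep in which every vertex sends one message along each outgoing edge, and it counts how many of those messages would cross a datacenter boundary. The function names, the example graph, the vertex placement and the grouping of datacenters into clusters are all hypothetical and do not come from the paper.

```python
def count_wan_messages(edges, placement):
    """Messages crossing a datacenter boundary in one superstep,
    assuming one message per directed edge (a simplifying assumption)."""
    return sum(1 for src, dst in edges if placement[src] != placement[dst])


def count_inter_cluster_messages(edges, placement, cluster_of):
    """Messages crossing a cluster boundary when datacenters are grouped
    into clusters (hypothetical grouping, for illustration only)."""
    return sum(
        1
        for src, dst in edges
        if cluster_of[placement[src]] != cluster_of[placement[dst]]
    )


# Hypothetical toy graph: directed edges between four vertices.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
# Hypothetical placement of vertices on three datacenters.
placement = {"a": "dc1", "b": "dc1", "c": "dc2", "d": "dc3"}
# Hypothetical clustering of nearby datacenters.
cluster_of = {"dc1": "east", "dc2": "east", "dc3": "west"}

wan = count_wan_messages(edges, placement)
inter = count_inter_cluster_messages(edges, placement, cluster_of)
print(f"cross-datacenter messages per superstep: {wan}")
print(f"cross-cluster messages per superstep: {inter}")
```

In this toy setting, fewer messages cross cluster boundaries than datacenter boundaries, which is the intuition behind grouping datacenters before exchanging data over long-distance links. How GeoGraph actually forms clusters and builds its meta-graph is not described in the abstract.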
Pages: 619-628
Page count: 10