SWeG: Lossless and Lossy Summarization of Web-Scale Graphs

Cited by: 26
Authors
Shin, Kijung [1]
Ghoting, Amol [2]
Kim, Myunghwan [2]
Raghavan, Hema [2]
Affiliations
[1] Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering, Daejeon, South Korea
[2] LinkedIn Corp, Mountain View, CA, USA
DOI
10.1145/3308558.3313402
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Given a terabyte-scale graph distributed across multiple machines, how can we summarize it, with far fewer nodes and edges, so that we can restore the original graph exactly or within error bounds? As large-scale graphs are ubiquitous, ranging from web graphs to online social networks, representing them compactly is important for storing and processing them efficiently. Given a graph, graph summarization aims to find a compact representation consisting of (a) a summary graph whose nodes are disjoint sets of nodes in the input graph and whose edges each indicate the edges between all pairs of nodes in the two sets; and (b) edge corrections for restoring the input graph from the summary graph exactly or within error bounds. Although graph summarization is a widely used graph-compression technique that is readily combinable with other techniques, existing algorithms for graph summarization are not satisfactory in terms of speed or compactness of outputs. More importantly, they assume that the input graph is small enough to fit in main memory. In this work, we propose SWeG, a fast parallel algorithm for summarizing graphs with compact representations. SWeG is designed for both shared-memory and MapReduce settings, so it can summarize graphs that are too large to fit in main memory. We demonstrate that SWeG is (a) Fast: SWeG is up to 5400x faster than its competitors that give similarly compact representations, (b) Scalable: SWeG scales to graphs with tens of billions of edges, and (c) Compact: combined with state-of-the-art compression methods, SWeG achieves up to 3.4x better compression than those methods alone.
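The representation described in the abstract (a summary graph plus edge corrections) can be illustrated with a minimal Python sketch of lossless restoration. This is not SWeG itself and not the authors' code; it only shows how a summary might be expanded back into the input graph. The function name `restore`, the data layout, and the treatment of a self-loop superedge as "all pairs inside one supernode" are assumptions made for illustration.

```python
from itertools import combinations, product


def restore(supernodes, superedges, add_edges, remove_edges):
    """Rebuild an undirected edge set from a summary graph and corrections.

    supernodes:   dict mapping a supernode id to a set of original nodes
    superedges:   iterable of (supernode_id, supernode_id) pairs
    add_edges:    original-graph edges missing from the expanded summary
    remove_edges: edges implied by the summary but absent from the original
    (All names and the layout are hypothetical, chosen for this sketch.)
    """
    edges = set()
    for a, b in superedges:
        if a == b:
            # Assumption: a self-loop superedge means all pairs inside the set.
            pairs = combinations(supernodes[a], 2)
        else:
            # A superedge stands for every pair across the two sets.
            pairs = product(supernodes[a], supernodes[b])
        edges.update(frozenset(p) for p in pairs)
    # Apply the corrections: first add missing edges, then drop spurious ones.
    edges |= {frozenset(e) for e in add_edges}
    edges -= {frozenset(e) for e in remove_edges}
    return edges


# Toy input: two supernodes, one superedge, and two corrections.
supernodes = {"A": {1, 2, 3}, "B": {4, 5}}
superedges = [("A", "B")]
add_edges = [(1, 2)]
remove_edges = [(3, 5)]

restored = restore(supernodes, superedges, add_edges, remove_edges)
for edge in sorted(tuple(sorted(e)) for e in restored):
    print(edge)
```

In this toy case the six original edges are encoded by one superedge and two corrections, which is the kind of saving the summary is meant to provide. A lossy variant would drop some corrections and accept a bounded restoration error.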
Pages: 1679-1690
Number of pages: 12