ByteGraph: A High-Performance Distributed Graph Database in ByteDance

Cited by: 5
Authors
Li, Changji [1, 2]
Chen, Hongzhi [2]
Zhang, Shuai [2]
Hu, Yingqian [2]
Chen, Chao [2]
Zhang, Zhenjie [2]
Li, Meng [2]
Li, Xiangchen [2]
Han, Dongqing [2]
Chen, Xiaohui [2]
Wang, Xudong [2]
Zhu, Huiming [2]
Fu, Xuwei [2]
Wu, Tingwei [2]
Tan, Hongfei [2]
Ding, Hengtian [2]
Liu, Mengjin [2]
Wang, Kangcheng [2]
Ye, Ting [2]
Li, Lei [2]
Li, Xin [2]
Wang, Yu [2]
Zheng, Chenguang [1, 2]
Yang, Hao [2]
Cheng, James [1]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] ByteDance Inc, Beijing, Peoples R China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2022 / Volume 15 / Issue 12
DOI
10.14778/3554821.3554824
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Most products at ByteDance, e.g., TikTok, Douyin, and Toutiao, naturally generate massive amounts of graph data. Efficiently storing, querying, and updating massive graph data is challenging for the broad range of products at ByteDance, which have various performance requirements. We categorize graph workloads at ByteDance into three types: online analytical processing, online transaction processing, and online serving processing, each with its own characteristics. Existing graph databases have different performance bottlenecks in handling these workloads, and none can efficiently handle the scale of graphs at ByteDance. We developed ByteGraph to process these graph workloads with high throughput, low latency, and high scalability. Several key designs in ByteGraph make it efficient for processing our workloads, including edge-trees to store adjacency lists for high parallelism and low memory usage, adaptive optimizations on thread pools and indexes, and geographic replication to achieve fault tolerance and availability. ByteGraph has been in production use for several years, and its performance has been shown to be robust for processing a wide range of graph workloads at ByteDance.
Pages: 3306-3318
Number of pages: 13