RepBun: Load-Balanced, Shuffle-Free Cluster Caching for Structured Data

Authors
Yu, Minchen [1]
Yu, Yinghao [1]
Zheng, Yunchuan [1]
Yang, Baichen [1]
Wang, Wei [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
DOI
10.1109/infocom41043.2020.9155409
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Cluster caching systems increasingly store structured data objects in columnar format. However, these systems routinely suffer from load imbalance that significantly impairs I/O performance. Existing load-balancing solutions, while effective for reading unstructured data objects, fall short in handling columnar data. Unlike unstructured data, which can only be read through a full-object scan, columnar data supports direct queries on specific columns and exhibits two distinct access patterns: (1) column popularity is heavily skewed, and (2) hot columns are likely to be accessed together in a query job. Based on these two access patterns, we propose an effective load-balancing solution for structured data. Our solution, which we call RepBun, groups hot columns into a bundle. It then creates multiple replicas of the column bundle and stores them uniformly across servers. We show that RepBun achieves improved load balancing with reduced memory overhead, while avoiding data shuffling between cache servers. We implemented RepBun atop Alluxio, a popular in-memory distributed storage system, and evaluated its performance through an EC2 deployment against the TPC-H benchmark workload. Experimental results show that RepBun outperforms existing load-balancing solutions with significantly shorter read latency and faster query completion.
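The bundle-and-replicate idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how hot columns are chosen, how many replicas are made, or how replicas are placed, so the top-k selection rule, the evenly spaced placement, and the round-robin query routing below are all assumptions, as are the function names and the access counts (the column names are borrowed from the TPC-H lineitem table for flavor only).

```python
from collections import Counter


def pick_hot_columns(access_counts, num_hot):
    """Return the num_hot most-accessed columns (assumed top-k rule)."""
    ranked = sorted(access_counts, key=lambda col: (-access_counts[col], col))
    return ranked[:num_hot]


def place_bundle_replicas(num_servers, num_replicas):
    """Pick evenly spaced, distinct servers for replicas of the whole bundle.

    Assumption: "uniformly across servers" is modeled as even spacing.
    """
    if not 1 <= num_replicas <= num_servers:
        raise ValueError("need 1 <= num_replicas <= num_servers")
    return [(i * num_servers) // num_replicas for i in range(num_replicas)]


def route_query(query_id, replica_servers):
    """Send a query to one replica; it reads all hot columns there,
    so no column data is shuffled between cache servers.

    Assumption: round-robin routing by query id.
    """
    return replica_servers[query_id % len(replica_servers)]


# Hypothetical, heavily skewed access counts.
access_counts = {
    "l_quantity": 90,
    "l_extendedprice": 80,
    "l_shipmode": 5,
    "l_tax": 3,
    "l_comment": 2,
}

bundle = pick_hot_columns(access_counts, num_hot=2)
replica_servers = place_bundle_replicas(num_servers=6, num_replicas=3)
load = Counter(route_query(q, replica_servers) for q in range(6))

print("bundle:", bundle)
print("replica servers:", replica_servers)
print("queries per server:", dict(load))
```

Because each replica holds the entire bundle, a query that touches several hot columns is served by a single server, which is the property the abstract describes as avoiding data shuffling; spreading queries over the replicas is what evens out the load.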
Pages: 954 - 963
Number of pages: 10
Related papers
44 in total
  • [1] Achieving Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition
    Yu, Yinghao
    Wang, Wei
    Huang, Renfei
    Zhang, Jun
    Ben Letaief, Khaled
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (02) : 439 - 454
  • [2] SP-Cache: Load-Balanced, Redundancy-Free Cluster Caching with Selective Partition
    Yu, Yinghao
    Huang, Renfei
    Wang, Wei
    Zhang, Jun
    Ben Letaief, Khaled
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE, AND ANALYSIS (SC'18), 2018
  • [3] EC-Cache: Load-balanced, Low-latency Cluster Caching with Online Erasure Coding
    Rashmi, K. V.
    Chowdhury, Mosharaf
    Kosaian, Jack
    Stoica, Ion
    Ramchandran, Kannan
    PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, 2016 : 401 - 417
  • [4] Scalable and Load-balanced Data Center Multicast
    Cui, Wenzhi
    Qian, Chen
    2015 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2015
  • [5] Load-Balanced Cluster for Scale-Out Storage of Knowledge
    Xiong, Zheng
    Zhu, Guocheng
    Yu, Wei
    Wang, Sen
    Chong, Zhihong
    2018 SIXTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD), 2018 : 1 - 5
  • [6] Load-Balanced Data Collection through Opportunistic Routing
    Michel, Mathieu
    Duquennoy, Simon
    Quoitin, Bruno
    Voigt, Thiemo
    2015 INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING IN SENSOR SYSTEMS (DCOSS), 2015 : 62 - 70
  • [7] Load-balanced data layout approach in data-intensive computing
    Song, J. (songjie@mail.neu.edu.cn), 1600, Beijing University of Posts and Telecommunications (36):
  • [8] Dynamic Data Repartitioning for Load-Balanced Parallel Particle Tracing
    Zhang, Jiang
    Guo, Hanqi
    Yuan, Xiaoru
    Peterka, Tom
    2018 IEEE PACIFIC VISUALIZATION SYMPOSIUM (PACIFICVIS), 2018 : 86 - 95
  • [9] Perfectly Load-Balanced, Stable, Synchronization-Free Parallel Merge
    Siebert, Christian
    Traeff, Jesper Larsson
    PARALLEL PROCESSING LETTERS, 2014, 24 (01)
  • [10] LBB: load-balanced batching for efficient distributed learning on heterogeneous GPU cluster
    Yao, Feixiang
    Zhang, Zhonghao
    Ji, Zeyu
    Liu, Bin
    Gao, Haoyuan
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (09) : 12247 - 12272