FullRepair: Towards Optimal Repair Pipelining in Erasure-Coded Clustered Storage Systems

Cited by: 0
Authors
Zhang, Yuzuo [1]
Tu, Xinyuan [1]
Wang, Lin [1]
Hu, Yuchong [1]
Wang, Fang [1]
Wang, Ye [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, CLUSTER | 2023
Funding
National Natural Science Foundation of China;
Keywords
distributed storage; erasure coding; parallelism; data reliability; recovery;
DOI
10.1109/CLUSTER52292.2023.00017
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Clustered storage systems often deploy erasure coding, which encodes data into coded chunks and distributes them across nodes to tolerate node failures. Erasure coding is a storage-efficient redundancy scheme but incurs a high repair penalty; thus, some state-of-the-art approaches pipeline the repair process to improve repair performance. However, we observe that all existing repair pipelining methods use only a single pipeline, leaving the network bandwidth of storage nodes underutilized. In this paper, we propose FullRepair, a new repair pipelining mechanism based on multiple pipelines that aims to fully exploit all available bandwidth during repair. We construct four constraints to model the repair pipelining problem, from which we obtain the optimal pipelined repair throughput under full bandwidth utilization. We design a multi-pipeline scheduling scheme for FullRepair to achieve this optimality. Experiments on Amazon EC2 show that, compared with the state-of-the-art repair pipelining methods RP and PivotRepair, FullRepair reduces the single-chunk repair time by up to 45.40% and 33.19%, respectively.
Pages: 107-117
Number of pages: 11
Related papers
50 in total
  • [1] Repair Pipelining for Erasure-Coded Storage
    Li, Runhui
    Li, Xiaolu
    Lee, Patrick P. C.
    Huang, Qun
    2017 USENIX ANNUAL TECHNICAL CONFERENCE (USENIX ATC '17), 2017, : 567 - 579
  • [2] Repair Pipelining for Erasure-coded Storage: Algorithms and Evaluation
    Li, Xiaolu
    Yang, Zuoru
    Li, Jinhong
    Li, Runhui
    Lee, Patrick P. C.
    Huang, Qun
    Hu, Yuchong
    ACM TRANSACTIONS ON STORAGE, 2021, 17 (02)
  • [3] Repair Pipelining for Erasure-Coded Storage Based on Load-Balanced
    Jiang X.-Y.
    Li G.-Y.
    Zhou Y.
    Hu J.-P.
    Li H.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2020, 48 (05) : 930 - 936
  • [4] Fast Predictive Repair in Erasure-Coded Storage
    Shen, Zhirong
    Li, Xiaolu
    Lee, Patrick P. C.
    2019 49TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN 2019), 2019, : 556 - 567
  • [5] Fast Repair for Single Failure in Erasure-coded Distributed Storage Systems
    Zhang, Huayu
    Li, Hui
    Zhu, Bing
    Chen, Jun
    2014 IEEE 33RD INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS (SRDS), 2014, : 146 - 151
  • [6] SelectiveEC: Towards Balanced Recovery Load on Erasure-Coded Storage Systems
    Xu, Liangliang
    Lyu, Min
    Li, Qiliang
    Xie, Lingjiang
    Li, Cheng
    Xu, Yinlong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (10) : 2386 - 2400
  • [7] Rack Aware Data Placement for Network Consumption in Erasure-Coded Clustered Storage Systems
    Shao, Bilin
    Song, Dan
    Bian, Genqing
    Zhao, Yu
    INFORMATION, 2018, 9 (07)
  • [8] Optimal resilience for erasure-coded Byzantine distributed storage
    Cachin, Christian
    Tessaro, Stefano
    DSN 2006 INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2006, : 115 - 124
  • [9] Repair Tree: Fast Repair for Single Failure in Erasure-Coded Distributed Storage Systems
    Zhang, Huayu
    Li, Hui
    Li, Shuo-Yen Robert
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, 28 (06) : 1728 - 1739
  • [10] Optimal resilience for erasure-coded Byzantine distributed storage
    Cachin, C
    Tessaro, S
    DISTRIBUTED COMPUTING, PROCEEDINGS, 2005, 3724 : 497 - 498