FullRepair: Towards Optimal Repair Pipelining in Erasure-Coded Clustered Storage Systems

Cited by: 0
Authors
Zhang, Yuzuo [1 ]
Tu, Xinyuan [1 ]
Wang, Lin [1 ]
Hu, Yuchong [1 ]
Wang, Fang [1 ]
Wang, Ye [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, CLUSTER | 2023
Funding
National Natural Science Foundation of China;
Keywords
distributed storage; erasure coding; parallelism; data reliability; recovery;
DOI
10.1109/CLUSTER52292.2023.00017
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Clustered storage systems often deploy erasure coding, which encodes data into coded chunks and distributes them across nodes to tolerate node failures. Erasure coding is a storage-efficient redundancy scheme but incurs a high repair penalty; some state-of-the-art approaches therefore pipeline the repair process to improve repair performance. However, we observe that all existing repair pipelining methods use only a single pipeline, leaving the network bandwidth of storage nodes underutilized. In this paper, we propose FullRepair, a new repair pipelining mechanism based on multiple pipelines that aims to fully exploit all available bandwidth resources during repair. We construct four constraints to model the repair pipelining problem so that we can obtain the optimal pipelined repair throughput under full bandwidth utilization. We design a multi-pipeline scheduling scheme for FullRepair to achieve this optimality. Experiments on Amazon EC2 show that, compared with the state-of-the-art repair pipelining methods RP and PivotRepair, FullRepair reduces the repair time of a single chunk by up to 45.40% and 33.19%, respectively.
Pages: 107 - 117
Number of pages: 11
Related papers
50 in total
  • [11] CPU: Cross-Rack-Aware Pipelining Update for Erasure-Coded Storage
    Wu, Haiqiao
    Du, Wan
    Gong, Peng
    Wu, Dapeng Oliver
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2022, 10 (04) : 2424 - 2436
  • [12] Parallelized In-Network Aggregation for Failure Repair in Erasure-Coded Storage Systems
    Xia, Junxu
    Luo, Lailong
    Sun, Bowen
    Cheng, Geyao
    Guo, Deke
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2024, 32 (04) : 2888 - 2903
  • [13] PivotRepair: Fast Pipelined Repair for Erasure-Coded Hot Storage
    Yao, Qiaori
    Hu, Yuchong
    Tu, Xinyuan
    Lee, Patrick P. C.
    Feng, Dan
    Zhu, Xia
    Zhang, Xiaoyang
    Yao, Zhen
    Wei, Wenjia
    2022 IEEE 42ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2022), 2022, : 614 - 624
  • [14] Boosting Full-Node Repair in Erasure-Coded Storage
    Lin, Shiyao
    Gong, Guowen
    Shen, Zhirong
    Lee, Patrick P. C.
    Shu, Jiwu
    PROCEEDINGS OF THE 2021 USENIX ANNUAL TECHNICAL CONFERENCE, 2021, : 641 - 655
  • [15] Survey on Data Updating in Erasure-Coded Storage Systems
    Zhang Y.
    Chu J.
    Weng C.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2020, 57 (11) : 2419 - 2431
  • [16] Data Management in Erasure-Coded Distributed Storage Systems
    Aatish, Chiniah
    Avinash, Mungur
    2020 20TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2020), 2020, : 902 - 907
  • [17] Online Encoding for Erasure-Coded Distributed Storage Systems
    Xu, Fangliang
    Wang, Yijie
    Ma, Xingkong
    2017 IEEE 37TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS WORKSHOPS (ICDCSW), 2017, : 338 - 342
  • [18] Modeling and Optimization of Latency in Erasure-coded Storage Systems
    Aggarwal, Vaneet
    Lan, Tian
    FOUNDATIONS AND TRENDS IN COMMUNICATIONS AND INFORMATION THEORY, 2021, 18 (03) : 380 - 525
  • [19] A Rack-Aware Pipeline Repair Scheme for Erasure-Coded Distributed Storage Systems
    Liu, Tong
    Alibhai, Shakeel
    He, Xubin
    PROCEEDINGS OF THE 49TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2020, 2020,
  • [20] Fast Proactive Repair in Erasure-Coded Storage: Analysis, Design, and Implementation
    Li, Xiaolu
    Cheng, Keyun
    Shen, Zhirong
    Lee, Patrick P. C.
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3400 - 3414