A Focused Garbage Collection Approach for Primary Deduplicated Storage with Low Memory Overhead

Cited by: 1
Authors
Yuan, Jingsong [1]
Zou, Xiangyu [1]
Xu, Han [1]
Cao, Zhichao [2]
Li, Shiyi [1]
Xia, Wen [1, 4]
Wang, Peng [3]
Chen, Li [3]
Affiliations
[1] Harbin Inst Technol, Shenzhen, Peoples R China
[2] Arizona State Univ, Tempe, AZ 85287 USA
[3] Huawei Technol Co Ltd, Shenzhen, Peoples R China
[4] Guangdong Prov Key Lab Novel Secur Intelligence T, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deduplication; Garbage Collection; Overhead;
DOI
10.1109/ICCD56317.2022.00053
Chinese Library Classification number
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Because one chunk can be shared by many files after data deduplication, Garbage Collection (GC) is an essential but complex task for reclaiming stale chunks in large-scale primary deduplication systems. Traditional Mark&Sweep is widely used, but in the Mark phase it suffers from ever-increasing traversal time and from the large memory overhead of the Liveness Array (i.e., a data structure that records which chunks are still alive). This paper proposes a new method named Focused Garbage Collection (FGC) that significantly accelerates the Mark phase for primary deduplication storage. Specifically, we design a global Austere Reference Graph with low memory cost that efficiently represents the reference relationships among files (i.e., which files share chunks after deduplication) by exploiting the deduplication characteristics of workloads in primary systems. The Austere Reference Graph lets FGC focus on the deleted files and their correlated files to mark stale chunks quickly, whereas traditional approaches must traverse all files. Consequently, FGC greatly reduces both the traversal time and the Liveness Array size in the Mark phase. Evaluation results show that, compared with traditional Mark&Sweep, FGC reduces Mark-phase time by 1.3x-7.34x in a stand-alone primary deduplication system and reduces Mark-phase network traffic by 128x-256x, while introducing less than 0.05% extra memory overhead for the reference graph.
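The contrast the abstract draws between a full Mark phase and a focused one can be illustrated with a minimal sketch. It is not the paper's Austere Reference Graph, whose layout the abstract does not describe. It assumes a plain adjacency map in which two files are linked when they share at least one chunk, and every name in it (`build_reference_graph`, `full_mark`, `focused_mark`, the sample files) is hypothetical.

```python
from itertools import combinations


def build_reference_graph(files):
    """Link every pair of files that share at least one chunk.

    Assumption: a naive pairwise build, used only for illustration.
    """
    graph = {name: set() for name in files}
    for a, b in combinations(files, 2):
        if files[a] & files[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def full_mark(files, deleted):
    """Traditional Mark: traverse every surviving file to find live chunks."""
    live = set()
    for name, chunks in files.items():
        if name not in deleted:
            live |= chunks
    all_chunks = set().union(*files.values())
    return all_chunks - live


def focused_mark(files, graph, deleted):
    """Focused Mark: visit only the surviving neighbours of deleted files."""
    stale = set()
    for name in deleted:
        stale |= files[name]  # every chunk of a deleted file is a candidate
    visited = set()
    for name in deleted:
        for neighbour in graph[name]:
            if neighbour in deleted or neighbour in visited:
                continue
            visited.add(neighbour)
            stale -= files[neighbour]  # still referenced, so not stale
    return stale, visited


# Hypothetical workload: file name -> set of chunk fingerprints.
files = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5, 6}, "d": {6, 7}}
deleted = {"a"}
graph = build_reference_graph(files)
stale, visited = focused_mark(files, graph, deleted)
print(stale, visited)
```

In this toy workload the focused pass reads only file `b`, the one surviving file that shares a chunk with the deleted file `a`, while the full pass reads `b`, `c`, and `d`; both report the same stale chunks.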
Pages: 315-323
Page count: 9