Fingerprinting the Checker Policies of Parallel File Systems

Cited by: 6
Authors
Han, Runzhou [1]
Zhang, Duo [1]
Zheng, Mai [1]
Affiliation
[1] Iowa State Univ, Ames, IA 50011 USA
DOI
10.1109/PDSW51947.2020.00013
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Parallel file systems (PFSes) play an essential role in high-performance computing. To ensure integrity, many PFSes include a checker component, which serves as the last line of defense for bringing a corrupted PFS back to a healthy state. Motivated by real-world incidents of PFS corruption, this paper presents a fine-grained study of the capability of PFS checkers. We apply type-aware fault injection to specific PFS structures and examine the detection and repair policies of PFS checkers via a well-defined taxonomy. Results on two representative PFS checkers show that they handle a wide range of corruptions on important data structures. However, neither is perfect: in multiple cases the checkers behave sub-optimally, leading to kernel panics, wrong repairs, and other problems. Our work has led to a new patch on Lustre. We hope to develop our methodology into a generic framework for analyzing the checkers of diverse PFSes and to enable more elegant designs of PFS checkers for reliable high-performance computing.
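The general idea of type-aware fault injection followed by classification of the checker's response can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `ToyInode`, `toy_checker`, and the `Detection`/`Repair` categories are hypothetical and are not the paper's taxonomy, nor any real Lustre or BeeGFS structure or API. The sketch corrupts one named field of a metadata record (rather than a random byte), runs a toy checker, and labels the outcome.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Detection(Enum):
    DETECTED = "detected"
    MISSED = "missed"


class Repair(Enum):
    NOT_ATTEMPTED = "not attempted"
    CORRECT = "repaired correctly"
    WRONG = "repaired incorrectly"


@dataclass(frozen=True)
class ToyInode:
    """Hypothetical metadata record; not a real PFS on-disk structure."""
    ino: int
    link_count: int
    parent: int


def inject(inode, field, value):
    """Type-aware injection: overwrite one named field with a chosen value."""
    return replace(inode, **{field: value})


def toy_checker(inode):
    """Hypothetical checker that validates and repairs link_count only."""
    if inode.link_count < 1:
        # Naive repair policy: reset to 1, which may not match the true value.
        return True, replace(inode, link_count=1)
    return False, inode


def fingerprint(golden, field, value):
    """Corrupt one field of a known-good record and classify the checker's response."""
    corrupted = inject(golden, field, value)
    detected, repaired = toy_checker(corrupted)
    if not detected:
        return Detection.MISSED, Repair.NOT_ATTEMPTED
    if repaired == golden:
        return Detection.DETECTED, Repair.CORRECT
    return Detection.DETECTED, Repair.WRONG


if __name__ == "__main__":
    golden = ToyInode(ino=7, link_count=2, parent=1)
    for field, value in [("link_count", 0), ("parent", 999)]:
        detection, repair = fingerprint(golden, field, value)
        print(f"{field}={value}: {detection.value}, {repair.value}")
```

In this toy setup, comparing the repaired record against the known-good copy is what separates a correct repair from a wrong one; a real study would need a comparable ground truth for each injected structure.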
Pages: 46-51
Page count: 6