An I/O Analysis of HPC Workloads on CephFS and Lustre

Cited: 0
Authors:
Chiusole, Alberto [1 ]
Cozzini, Stefano [1 ,2 ]
van der Ster, Daniel [3 ]
Lamanna, Massimo [3 ]
Giuliani, Graziano [4 ]
Affiliations:
[1] eXact Lab Srl, Via Beirut 2, I-34151 Trieste, Italy
[2] SISSA, CNR IOM, Via Bonomea 265, I-34136 Trieste, Italy
[3] CERN, Geneva 23, Switzerland
[4] Abdus Salam Int Ctr Theoret Phys, Str Costiera 11, I-34151 Trieste, Italy
Keywords:
Ceph; Lustre; HPC; RegCM; Lazy I/O; Performance
DOI:
10.1007/978-3-030-34356-9_24
CLC Classification:
TP301 [Theory and Methods]
Discipline Code:
081202
Abstract:
In this contribution we compare the Input/Output (I/O) performance of a High-Performance Computing (HPC) application on two different file systems, CephFS and Lustre; our goal is to assess whether CephFS is a valid choice for I/O-intensive HPC applications. We perform our analysis using a real HPC workload, RegCM, a climate simulation application, and IOR, a synthetic benchmark, which simulates several I/O patterns through different parallel I/O libraries (MPI-IO, HDF5, PnetCDF). We compare write performance for the two I/O approaches that RegCM implements: the so-called spokesperson (serial) approach and a truly parallel one. The small difference observed between the serial and the parallel approach motivates us to explore in detail how the software stack interacts with the underlying file systems. For this reason, we use IOR together with MPI-IO hints related to Collective Buffering and Data Sieving to analyze several I/O patterns on the two file systems. Finally, we investigate Lazy I/O, a unique feature of CephFS that disables the file coherency locks introduced by the file system; this allows Ceph to buffer writes and to fully exploit its parallel and distributed architecture. Two clusters were set up for these benchmarks, one at CNR-IOM and a second at the Pawsey Supercomputing Centre; we performed similar tests on both installations and recorded a fourfold I/O performance improvement with Lazy I/O enabled. The preliminary results collected so far are quite promising; further actions and possible new I/O optimizations are presented and discussed.
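For readers reproducing the hint experiments, a brief illustration follows; none of this code is from the paper. The Collective Buffering and Data Sieving hints mentioned in the abstract are passed to the MPI-IO layer through an MPI_Info object. The sketch below is a minimal C example using the standard ROMIO hint names ("romio_cb_write", "romio_ds_write"); the output path and hint values are illustrative only.

    /* Minimal sketch (not from the paper): passing ROMIO hints for
     * Collective Buffering and Data Sieving when opening a shared
     * file with MPI-IO. The path and hint values are illustrative. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_File fh;
        MPI_Info info;

        MPI_Init(&argc, &argv);
        MPI_Info_create(&info);
        /* Enable two-phase collective buffering for writes. */
        MPI_Info_set(info, "romio_cb_write", "enable");
        /* Disable data sieving for writes; with many concurrent
         * writers it is often counterproductive on parallel file systems. */
        MPI_Info_set(info, "romio_ds_write", "disable");

        MPI_File_open(MPI_COMM_WORLD, "/mnt/cephfs/output.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
        /* ... collective writes such as MPI_File_write_at_all() ... */
        MPI_File_close(&fh);
        MPI_Info_free(&info);
        MPI_Finalize();
        return 0;
    }

IOR can forward such hints to the MPI-IO layer (e.g. through a ROMIO hints file), so the benchmark patterns described above can be replayed with either hint combination. Lazy I/O, by contrast, is requested per file on the CephFS client side; the paper does not state which mechanism was used. A hypothetical sketch using the libcephfs ceph_lazyio() call (the kernel client exposes an ioctl instead):

    /* Hypothetical sketch: requesting CephFS Lazy I/O on a file opened
     * through libcephfs, assuming a build that provides ceph_lazyio().
     * Relaxing coherency locks makes the application responsible for
     * its own consistency across clients. */
    #include <cephfs/libcephfs.h>
    #include <fcntl.h>

    int open_lazy(struct ceph_mount_info *cmount, const char *path)
    {
        int fd = ceph_open(cmount, path, O_CREAT | O_WRONLY, 0644);
        if (fd >= 0)
            ceph_lazyio(cmount, fd, 1); /* 1 = enable lazy I/O */
        return fd;
    }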
Pages: 300-316
Page count: 17