On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems

Cited by: 69
Authors
Yildiz, Orcun [1]
Dorier, Matthieu [2]
Ibrahim, Shadi [1]
Ross, Rob [2]
Antoniu, Gabriel [1]
Affiliations
[1] INRIA Rennes Bretagne Atlant, Rennes, France
[2] Argonne Natl Lab, Argonne, IL 60439 USA
Keywords
Exascale I/O; Parallel File Systems; Cross-Application Contention; Interference
DOI
10.1109/IPDPS.2016.50
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As we move toward the exascale era, performance variability in HPC systems remains a challenge. I/O interference, a major cause of this variability, is becoming more important every day with the growing number of concurrent applications that share larger machines. Earlier research efforts on mitigating I/O interference focus on a single potential cause of interference (e.g., the network). Yet the root causes of I/O interference can be diverse. In this work, we conduct an extensive experimental campaign to explore the various root causes of I/O interference in HPC storage systems. We use microbenchmarks on the Grid'5000 testbed to evaluate how the applications' access pattern, the network components, the file system's configuration, and the backend storage devices influence I/O interference. Our studies reveal that in many situations interference is a result of bad flow control in the I/O path, rather than being caused by some single bottleneck in one of its components. We further show that interference-free behavior is not necessarily a sign of optimal performance. To the best of our knowledge, our work provides the first deep insight into the role of each of the potential root causes of interference and their interplay. Our findings can help developers and platform owners improve I/O performance and motivate further research addressing the problem across all components of the I/O stack.
Pages: 750 - 759
Page count: 10
Related papers
50 in total
  • [1] CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination
    Dorier, Matthieu
    Antoniu, Gabriel
    Ross, Rob
    Kimpe, Dries
    Ibrahim, Shadi
    2014 IEEE 28TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, 2014
  • [2] Evaluation of HPC Application I/O on Object Storage Systems
    Liu, Jialin
    Koziol, Quincey
    Butler, Gregory F.
    Fortner, Neil
    Chaarawi, Mohamad
    Tang, Houjun
    Byna, Suren
    Lockwood, Glenn K.
    Cheema, Ravi
    Kallback-Rose, Kristy A.
    Hazen, Damian
    Prabhat
    PROCEEDINGS OF 2018 IEEE/ACM 3RD JOINT INTERNATIONAL WORKSHOP ON PARALLEL DATA STORAGE & DATA INTENSIVE SCALABLE COMPUTING SYSTEMS (PDSW-DISCS), 2018, : 24 - 34
  • [3] A multivariate and quantitative model for predicting cross-application interference in virtual environments
    Alves, Maicon Melo
    de Assumpcao Drummond, Lucia Maria
    JOURNAL OF SYSTEMS AND SOFTWARE, 2017, 128 : 150 - 163
  • [4] IntP: Quantifying cross-application interference via system-level instrumentation
    Xavier, Miguel G.
    Cano, Carlos H. C.
    Meyer, Vinicius
    De Rose, Cesar A. F.
    2022 IEEE 34TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2022), 2022, : 231 - 240
  • [5] Demystifying asynchronous I/O Interference in HPC applications
    Tseng, Shu-Mei
    Nicolae, Bogdan
    Cappello, Franck
    Chandramowlishwaran, Aparna
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2021, 35 (04) : 391 - 412
  • [6] Can I/O Variability Be Reduced on QoS-Less HPC Storage Systems?
    Huang, Dan
    Liu, Qing
    Choi, Jong
    Podhorszki, Norbert
    Klasky, Scott
    Logan, Jeremy
    Ostrouchov, George
    He, Xubin
    Wolf, Matthew
    IEEE TRANSACTIONS ON COMPUTERS, 2019, 68 (05) : 631 - 645
  • [7] An End-to-end and Adaptive I/O Optimization Tool for Modern HPC Storage Systems
    Yang, Bin
    Zou, Yanliang
    Liu, Weiguo
    Xue, Wei
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2022), 2022, : 1294 - 1304
  • [8] Evaluating Asynchronous Parallel I/O on HPC Systems
    Ravi, John
    Byna, Suren
    Koziol, Quincey
    Tang, Houjun
    Becchi, Michela
    2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, IPDPS, 2023, : 211 - 221
  • [9] Scheduling Distributed I/O Resources in HPC Systems
    Bandet, Alexis
    Boito, Francieli
    Pallez, Guillaume
    EURO-PAR 2024: PARALLEL PROCESSING, PT I, EURO-PAR 2024, 2024, 14801 : 137 - 151
  • [10] Automated Modeling of I/O Performance and Interference Effects in Virtualized Storage Systems
    Noorshams, Qais
    Busch, Axel
    Rentschler, Andreas
    Bruhn, Dominik
    Kounev, Samuel
    Tuma, Petr
    Reussner, Ralf
    2014 IEEE 34TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS WORKSHOPS (ICDCSW), 2014, : 88 - 93