On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems

Cited by: 69
Authors
Yildiz, Orcun [1]
Dorier, Matthieu [2]
Ibrahim, Shadi [1]
Ross, Rob [2]
Antoniu, Gabriel [1]
Affiliations
[1] Inria Rennes Bretagne Atlantique, Rennes, France
[2] Argonne National Laboratory, Argonne, IL 60439, USA
Keywords
Exascale I/O; Parallel File Systems; Cross-Application Contention; Interference
DOI
10.1109/IPDPS.2016.50
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As we move toward the exascale era, performance variability in HPC systems remains a challenge. I/O interference, a major cause of this variability, is becoming more important every day with the growing number of concurrent applications that share larger machines. Earlier research efforts on mitigating I/O interference focus on a single potential cause of interference (e.g., the network). Yet the root causes of I/O interference can be diverse. In this work, we conduct an extensive experimental campaign to explore the various root causes of I/O interference in HPC storage systems. We use microbenchmarks on the Grid'5000 testbed to evaluate how the applications' access pattern, the network components, the file system's configuration, and the backend storage devices influence I/O interference. Our studies reveal that in many situations interference is a result of bad flow control in the I/O path, rather than being caused by some single bottleneck in one of its components. We further show that interference-free behavior is not necessarily a sign of optimal performance. To the best of our knowledge, our work provides the first deep insight into the role of each of the potential root causes of interference and their interplay. Our findings can help developers and platform owners improve I/O performance and motivate further research addressing the problem across all components of the I/O stack.
Pages: 750-759
Page count: 10