On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems

Cited by: 69
Authors
Yildiz, Orcun [1]
Dorier, Matthieu [2]
Ibrahim, Shadi [1]
Ross, Rob [2]
Antoniu, Gabriel [1]
Affiliations
[1] INRIA Rennes Bretagne Atlant, Rennes, France
[2] Argonne Natl Lab, Argonne, IL 60439 USA
Keywords
Exascale I/O; Parallel File Systems; Cross-Application Contention; Interference
DOI
10.1109/IPDPS.2016.50
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As we move toward the exascale era, performance variability in HPC systems remains a challenge. I/O interference, a major cause of this variability, is becoming more important every day with the growing number of concurrent applications that share larger machines. Earlier research efforts on mitigating I/O interference focus on a single potential cause of interference (e.g., the network). Yet the root causes of I/O interference can be diverse. In this work, we conduct an extensive experimental campaign to explore the various root causes of I/O interference in HPC storage systems. We use microbenchmarks on the Grid'5000 testbed to evaluate how the applications' access pattern, the network components, the file system's configuration, and the backend storage devices influence I/O interference. Our studies reveal that in many situations interference is a result of bad flow control in the I/O path, rather than being caused by some single bottleneck in one of its components. We further show that interference-free behavior is not necessarily a sign of optimal performance. To the best of our knowledge, our work provides the first deep insight into the role of each of the potential root causes of interference and their interplay. Our findings can help developers and platform owners improve I/O performance and motivate further research addressing the problem across all components of the I/O stack.
Pages: 750-759
Page count: 10