Towards I/O analysis of HPC systems and a generic architecture to collect access patterns

Cited by: 11
Authors
Wiedemann, Marc C. [1, 2]
Kunkel, Julian M. [2]
Zimmer, Michaela [2]
Ludwig, Thomas [2]
Resch, Michael [3]
Boenisch, Thomas [3]
Wang, Xuan [3]
Chut, Andriy [3]
Aguilera, Alvaro [4]
Nagel, Wolfgang E. [4]
Kluge, Michael [4]
Mickler, Holger [4]
Affiliations
[1] Bundesstr 45a, D-20146 Hamburg, Germany
[2] Univ Hamburg, Deutsch Klimarechenzentrum GmbH, Hamburg, Germany
[3] Univ Stuttgart, High Performance Comp Ctr Stuttgart HLRS, Stuttgart, Germany
[4] Tech Univ Dresden, Zentrum Informationsdienste & Hochleistungsrechne, Dresden, Germany
Source
Keywords
I/O analysis; I/O path; Causality tree;
DOI
10.1007/s00450-012-0221-5
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In high-performance computing applications, a single high-level I/O call triggers activities on a multitude of hardware components. These applications run on massively parallel systems backed by huge storage systems and several internal software layers. The complex interplay of these components currently makes it impossible to identify the causes and locations of I/O bottlenecks. Existing tools indicate when a bottleneck occurs but provide little guidance in identifying its cause or improving the situation. We have thus initiated the Scalable I/O for Extreme Performance (SIOX) project to find solutions for this problem. In SIOX, we will build a system that records access information on all layers and components, recognizes access patterns, and characterizes the I/O system. The system will ultimately be able to recognize the causes of I/O bottlenecks and propose optimizations for the I/O middleware that improve I/O performance metrics such as throughput and latency. Furthermore, the SIOX system will be able to support decision making when planning new I/O systems. In this paper, we introduce the SIOX system and describe its current status. We first outline our approach for collecting the required access information. We then present the architectural concept, the methods for reconstructing the I/O path, and an excerpt of the interface for data collection. The paper focuses on the architecture, which collects and combines the relevant access information along the I/O path and is responsible for transferring this information efficiently. An abstract modelling approach allows us to better understand the complexity of analysing I/O activities on parallel computing systems, and an abstract interface allows us to adapt the SIOX system to various HPC file systems.
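The idea of reconstructing an I/O path as a causality tree can be illustrated with a small sketch. It is not the SIOX implementation or its interface: the `Activity` record, its fields, the layer names, and both helper functions are hypothetical, chosen only to show how activities recorded on different layers might be linked through a "caused by" reference and then searched for the slowest chain.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Activity:
    """One recorded I/O activity on some layer (hypothetical record format)."""
    activity_id: int
    layer: str                 # e.g. "MPI-IO", "POSIX", "file system server"
    duration_ms: float
    cause_id: Optional[int]    # id of the activity that triggered this one
    children: list = field(default_factory=list)


def build_causality_tree(activities):
    """Attach each activity to its cause and return the root activities.

    An activity whose cause is missing or unknown is treated as a root.
    """
    by_id = {a.activity_id: a for a in activities}
    roots = []
    for a in activities:
        parent = by_id.get(a.cause_id) if a.cause_id is not None else None
        if parent is None:
            roots.append(a)
        else:
            parent.children.append(a)
    return roots


def slowest_path(node):
    """Greedy descent: at each level, follow the child with the longest duration."""
    path = [node]
    while node.children:
        node = max(node.children, key=lambda c: c.duration_ms)
        path.append(node)
    return path


# Illustrative records only; the durations are made up for the example.
records = [
    Activity(1, "MPI-IO", 12.0, None),
    Activity(2, "POSIX", 11.0, 1),
    Activity(3, "POSIX", 0.5, 1),
    Activity(4, "file system server", 10.2, 2),
]
roots = build_causality_tree(records)
path_layers = [a.layer for a in slowest_path(roots[0])]
print(" -> ".join(path_layers))
```

The greedy descent is a deliberate simplification: a real analysis would likely need to account for overlapping and asynchronous activities, which this sketch ignores.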
Pages: 241-251
Number of pages: 11