Case Studies on the Impact and Challenges of Heterogeneous NUMA Architectures for HPC

Cited by: 0
Authors
Zaourar, Lilia [1]
Benazouz, Mohamed [1]
Mouhagir, Ayoub [1]
Falquez, Carlos [2]
Portero, Antoni [2]
Ho, Nam [2]
Suarez, Estela [2]
Petrakis, Polydoros [3]
Marazakis, Manolis [3]
Sgherzi, Francesco [4]
Fernandez, Ivan [4]
Dolbeau, Romain [5]
Pleiter, Dirk [6]
Affiliations
[1] Univ Paris Saclay, List, CEA, F-91120 Palaiseau, France
[2] Forschungszentrum Julich, Inst Adv Simulat, Julich Supercomp Ctr, Julich, Germany
[3] Fdn Res & Technol Hellas FORTH, Inst Comp Sci, Iraklion, Greece
[4] Barcelona Supercomp Ctr BSC, Barcelona, Spain
[5] SiPearl, Rennes, France
[6] KTH Royal Inst Technol, Stockholm, Sweden
Keywords
Non-Uniform Memory Access (NUMA); co-design; simulation; High Performance Computing (HPC); benchmarking;
DOI
10.1007/978-3-031-66146-4_17
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The memory systems of High-Performance Computing (HPC) systems commonly feature non-uniform data paths to memory, i.e., they are non-uniform memory access (NUMA) architectures. Memory is divided into multiple regions, with each processing unit having its own local memory. Therefore, for each processing unit, access to local memory regions is faster than access to non-local regions. Architectures with hybrid memory technologies result in further non-uniformity. This paper presents case studies of the performance potential and data placement implications of non-uniform and heterogeneous memory in HPC systems. Using the gem5 and VPSim simulation platforms, we model NUMA systems with processors based on the ARMv8 Neoverse V1 Reference Design. The gem5 simulator provides a cycle-accurate view, while VPSim offers greater simulation speed, with a high-level view of the simulated system. We highlight the performance impact of design trade-offs regarding NUMA node organization and System Level Cache (SLC) group assignment, as well as Network-on-Chip (NoC) configuration. Our case studies provide essential input to a co-design process involving HPC processor architects and system integrators. A comparison of system configurations for different NoC bandwidths shows reduced NoC latency and high memory bandwidth improvement when NUMA control is enabled. Furthermore, a configuration with HBM2 memory organized as four NUMA nodes highlights the memory bandwidth performance gap and NoC queuing latency impact when comparing local vs. remote memory accesses. On the other hand, NUMA can result in an unbalanced distribution of memory accesses and reduced SLC hit ratios, as shown with DDR4 memory organized as four NUMA nodes.
Pages: 251-265
Number of pages: 15
Related papers
50 in total
  • [21] Case studies on analyzing software architectures for usability
    Folmer, E
    Bosch, J
    EUROMICRO-SEAA 2005: 31st EUROMICRO Conference on Software Engineering and Advanced Applications, Proceedings, 2005, : 206 - 213
  • [22] Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases
    Oliveira, Geraldo F.
    Boroumand, Amirali
    Ghose, Saugata
    Gomez-Luna, Juan
    Mutlu, Onur
    2022 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2022), 2022, : 273 - 278
  • [23] Case studies of system architectures that use COBOL assets
    Okishima, Haruhiro
    FUJITSU SCIENTIFIC & TECHNICAL JOURNAL, 2006, 42 (03): : 414 - 421
  • [24] A Comprehensive Survey on Mobility Management in 5G Heterogeneous Networks: Architectures, Challenges and Solutions
    Gures, Emre
    Shayea, Ibraheem
    Alhammadi, Abdulraqeb
    Ergen, Mustafa
    Mohamad, Hafizal
    IEEE ACCESS, 2020, 8 : 195883 - 195913
  • [25] A Case Study: Holistic Performance Analysis on Heterogeneous Architectures using the Vampir Toolchain
    Dietrich, Robert
    Winkler, Frank
    William, Thomas
    Stolle, Jonas
    Henschel, Robert
    Berry, Donald K.
    PARALLEL COMPUTING: ACCELERATING COMPUTATIONAL SCIENCE AND ENGINEERING (CSE), 2014, 25 : 793 - 802
  • [26] Treatment challenges in NMOSD with case studies
    Paul, F.
    MULTIPLE SCLEROSIS JOURNAL, 2019, 25 : 24 - 24
  • [27] Software Challenges in Heterogeneous Computing: A Multiple Case Study in Industry
    Andrade, Hugo
    Lwakatare, Lucy Ellen
    Crnkovic, Ivica
    Bosch, Jan
    2019 45TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA 2019), 2019, : 148 - 155
  • [28] Networks-on-Chip: Architectures, Design Methodologies, and Case Studies
    Chen, Sao-Jie
    Wu, An-Yeu Andy
    Xu, Jiang
    JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING, 2012, 2012
  • [29] Cognitive architectures as Lakatosian research programs: Two case studies
    Cooper, RP
    PHILOSOPHICAL PSYCHOLOGY, 2006, 19 (02) : 199 - 220
  • [30] Performance Impact of a Slower Main Memory: A case study of STT-MRAM in HPC
    Asifuzzaman, Kazi
    Pavlovic, Milan
    Radulovic, Milan
    Zaragoza, David
    Kwon, Ohseong
    Ryoo, Kyung-Chang
    Radojkovic, Petar
    MEMSYS 2016: PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON MEMORY SYSTEMS, 2016, : 40 - 49