Benchmarking multi-GPU applications on modern multi-GPU integrated systems

Cited by: 1
Authors
Bernaschi, Massimo [1]
Agostini, Elena [2]
Rossetti, Davide [2]
Affiliations
[1] CNR, I-00185 Rome, Italy
[2] NVIDIA, Santa Clara, CA USA
Source
Keywords
approximate inverse; DGX-1; GPUDirect; POWER9; spin; LINEAR-SYSTEMS;
DOI
10.1002/cpe.5470
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
GPUs are very powerful computing accelerators that are often employed in a single-device configuration. However, there is steadily growing interest in using multiple GPUs concurrently, both to overcome the memory limitations of a single device and to further reduce execution times. Until recently, communication among GPUs was carried out mainly through networking technologies originally devised for standard CPUs, with the CPU playing an active role in the communication. However, new alternatives are becoming available in which a moderate number of GPUs are directly connected to each other by means of proprietary technologies. We present the results of a set of experiments aimed at assessing the performance of some of these hardware/software platforms, using a particularly challenging application as a benchmark. We release its source code to help those interested in reproducing or extending our results.
Pages: 15
Related papers
50 items in total
  • [31] Dynamic load balancing on heterogeneous multi-GPU systems
    Acosta, Alejandro
    Blanco, Vicente
    Almeida, Francisco
    COMPUTERS & ELECTRICAL ENGINEERING, 2013, 39 (08) : 2591 - 2602
  • [32] Moim: A Multi-GPU MapReduce Framework
    Xie, Mengjun
    Kang, Kyoung-Don
    Basaran, Can
    2013 IEEE 16TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2013), 2013, : 1279 - 1286
  • [33] An introduction to multi-GPU programming for physicists
    M. Bernaschi
    M. Bisson
    M. Fatica
    E. Phillips
    The European Physical Journal Special Topics, 2012, 210 : 17 - 31
  • [34] Efficient Implementation of MrBayes on Multi-GPU
    Bao, Jie
    Xia, Hongju
    Zhou, Jianfu
    Liu, Xiaoguang
    Wang, Gang
    MOLECULAR BIOLOGY AND EVOLUTION, 2013, 30 (06) : 1471 - 1479
  • [35] Tensor Movement Orchestration in Multi-GPU Training Systems
    Lin, Shao-Fu
    Chen, Yi-Jung
    Cheng, Hsiang-Yun
    Yang, Chia-Lin
    2023 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA, 2023, : 1140 - 1152
  • [36] Gossip: Efficient Communication Primitives for Multi-GPU Systems
    Kobus, Robin
    Juenger, Daniel
    Hundt, Christian
    Schmidt, Bertil
    PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019), 2019,
  • [37] Solving Multiple Tridiagonal Systems on a Multi-GPU Platform
    Dieguez, Adrian P.
    Amor, Margarita
    Doallo, Ramon
    2018 26TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2018), 2018, : 759 - 763
  • [38] HPSM: A Programming Framework for Multi-CPU and Multi-GPU Systems
    Lima, Joao V. F.
    Di Domenico, Daniel
    2017 INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING WORKSHOPS (SBAC-PADW), 2017, : 31 - 36
  • [39] Multi-Objective Concurrent Kernel Scheduling for Multi-GPU Systems
    Alizadeh, Negar Baradar
    Momtazpour, Mahmoud
    2024 32ND INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, ICEE 2024, 2024, : 859 - 864
  • [40] Algorithmic skeletons for multi-core, multi-GPU systems and clusters
    Ernsting, Steffen
    Kuchen, Herbert
    International Journal of High Performance Computing and Networking, 2012, 7 (02) : 129 - 138