Benchmarking multi-GPU applications on modern multi-GPU integrated systems

Cited by: 1

Authors:

Bernaschi, Massimo [1]

Agostini, Elena [2]

Rossetti, Davide [2]

Affiliations:

[1] CNR, I-00185 Rome, Italy

[2] NVIDIA, Santa Clara, CA USA

Source:

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2021 / Vol. 33 / Issue 14

Keywords:

approximate inverse; DGX-1; GPUDirect; POWER9; spin; LINEAR-SYSTEMS

DOI:

10.1002/cpe.5470

Chinese Library Classification:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

GPUs are very powerful computing accelerators that are often employed in single-device configurations. However, there is steadily growing interest in using multiple GPUs concurrently, both to overcome the memory limitations of a single device and to further reduce execution times. Until recently, communication among GPUs was carried out mainly through networking technologies originally devised for standard CPUs, with the CPU playing an active role in the communication. However, new alternatives are becoming available in which a moderate number of GPUs are directly connected to each other by means of proprietary technologies. We present the results of a set of experiments aimed at assessing the performance of some of these hardware/software platforms, using a particularly challenging application as a benchmark. We release its source code to help those interested in reproducing or extending our results.

Pages: 15
