Design of a Hybrid MPI-CUDA Benchmark Suite for CPU-GPU Clusters

Cited by: 5

Authors:

Agarwal, Tejaswi [1]

Becchi, Michela [1]

Affiliations:

[1] Univ Missouri, Columbia, MO 65211 USA

Source:

PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT'14) | 2014

Keywords:

Benchmark; CUDA-MPI; clusters; GPU

DOI:

10.1145/2628071.2671423

Chinese Library Classification (CLC):

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

In the last few years, GPUs have become an integral part of HPC clusters. To test these heterogeneous CPU-GPU systems, we designed a hybrid CUDA-MPI benchmark suite that consists of three communication- and compute-intensive applications: Matrix Multiplication (MM), Needleman-Wunsch (NW), and the ADFA compression algorithm [1]. The main goal of this work is to characterize these workloads on CPU-GPU clusters. Our benchmark applications are designed to allow cluster administrators to identify bottlenecks in the cluster, to decide whether scaling applications to multiple nodes would improve or decrease overall throughput, and to design effective scheduling policies. Our experiments show that inter-node communication can significantly degrade the throughput of communication-intensive applications. We conclude that the scalability of the applications depends primarily on two factors: the cluster configuration and the application characteristics.

Pages: 505-506

Page count: 2
