Design of a Hybrid MPI-CUDA Benchmark Suite for CPU-GPU Clusters

Cited by: 5
Authors
Agarwal, Tejaswi [1]
Becchi, Michela [1]
Affiliation
[1] Univ Missouri, Columbia, MO 65211 USA
Keywords
Benchmark; CUDA-MPI; clusters; GPU
DOI
10.1145/2628071.2671423
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the last few years, GPUs have become an integral part of HPC clusters. To test these heterogeneous CPU-GPU systems, we designed a hybrid CUDA-MPI benchmark suite that consists of three communication- and compute-intensive applications: Matrix Multiplication (MM), Needleman-Wunsch (NW), and the ADFA compression algorithm [1]. The main goal of this work is to characterize these workloads on CPU-GPU clusters. Our benchmark applications are designed to allow cluster administrators to identify bottlenecks in the cluster, to decide whether scaling applications to multiple nodes would improve or decrease overall throughput, and to design effective scheduling policies. Our experiments show that inter-node communication can significantly degrade the throughput of communication-intensive applications. We conclude that the scalability of the applications depends primarily on two factors: the cluster configuration and the application characteristics.
Pages: 505-506
Number of pages: 2
Related papers
50 in total
  • [31] CPU-GPU hybrid parallel strategy for cosmological simulations
    Wang, Yueqing
    Dou, Yong
    Guo, Song
    Lei, Yuanwu
    Zou, Dan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2014, 26 (03): 748-765
  • [32] Hybrid CPU-GPU Community Detection in Weighted Networks
    Souravlas, Stavros
    Sifaleras, Angelo
    Katsavounis, Stefanos
    IEEE ACCESS, 2020, 8: 57527-57551
  • [33] Hybrid CPU-GPU scheduling and execution of tree traversals
    Liu, Jianqiao
    Hegde, Nikhil
    Kulkarni, Milind
    ACM SIGPLAN NOTICES, 2016, 51 (08): 385-386
  • [34] Scalable critical-path analysis and optimization guidance for hybrid MPI-CUDA applications
    Schmitt, Felix
    Dietrich, Robert
    Juckeland, Guido
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2017, 31 (06): 485-498
  • [35] Distributed multi-scale muscle simulation in a hybrid MPI-CUDA computational environment
    Ivanovic, Milos
    Stojanovic, Boban
    Kaplarevic-Malisic, Ana
    Gilbert, Richard
    Mijailovich, Srboljub
    SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL, 2016, 92 (01): 19-31
  • [36] Implementing Delay Multiply and Sum Beamformer on a Hybrid CPU-GPU Platform for Medical Ultrasound Imaging Using OpenMP and CUDA
    Song, Ke
    Liu, Paul
    Liu, Dongquan
    CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2021, 128 (03): 1133-1150
  • [37] Hybrid CPU-GPU Computation of Adjoint Derivatives in Time Domain
    Statz, Christoph
    Muetze, Marco
    Hegler, Sebastian
    Plettemeier, Dirk
    2013 COMPUTATIONAL ELECTROMAGNETICS WORKSHOP (CEM'13), 2013: 32-33
  • [38] Optimizing tensor contraction expressions for hybrid CPU-GPU execution
    Ma, Wenjing
    Krishnamoorthy, Sriram
    Villa, Oreste
    Kowalski, Karol
    Agrawal, Gagan
    CLUSTER COMPUTING, 2013, 16: 131-155
  • [39] A hybrid CPU-GPU paradigm to accelerate reactive CFD simulations
    Ghioldi, Federico
    Piscaglia, Federico
    INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, 2024, 96 (08): 1461-1488
  • [40] CPU-GPU Hybrid Parallel Binomial American Option Pricing
    Zhang, Nan
    Lim, Eng Gee
    Man, Ka Lok
    Lei, Chi-Un
    INTERNATIONAL MULTICONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS, IMECS 2012, VOL II, 2012: 1157-1162