Design of a Hybrid MPI-CUDA Benchmark Suite for CPU-GPU Clusters

Cited by: 5
Authors
Agarwal, Tejaswi [1]
Becchi, Michela [1]
Affiliations
[1] Univ Missouri, Columbia, MO 65211 USA
Keywords
Benchmark; CUDA-MPI; clusters; GPU
DOI
10.1145/2628071.2671423
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the last few years, GPUs have become an integral part of HPC clusters. To test these heterogeneous CPU-GPU systems, we designed a hybrid CUDA-MPI benchmark suite that consists of three communication- and compute-intensive applications: Matrix Multiplication (MM), Needleman-Wunsch (NW), and the ADFA compression algorithm [1]. The main goal of this work is to characterize these workloads on CPU-GPU clusters. Our benchmark applications are designed to allow cluster administrators to identify bottlenecks in the cluster, to decide whether scaling applications to multiple nodes would improve or decrease overall throughput, and to design effective scheduling policies. Our experiments show that inter-node communication can significantly degrade the throughput of communication-intensive applications. We conclude that the scalability of the applications depends primarily on two factors: the cluster configuration and the application characteristics.
Pages: 505-506
Number of pages: 2
Related papers
  • [1] Boosting CUDA Applications with CPU-GPU Hybrid Computing
    Lee, Changmin
    Ro, Won Woo
    Gaudiot, Jean-Luc
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2014, 42 (02) : 384 - 404
  • [2] Hetero-Mark, A Benchmark Suite for CPU-GPU Collaborative Computing
    Sun, Yifan
    Gong, Xiang
    Ziabari, Amir Kavyan
    Yu, Leiming
    Li, Xiangyu
    Mukherjee, Saoni
    McCardwell, Carter
    Villegas, Alejandro
    Kaeli, David
    PROCEEDINGS OF THE 2016 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION, 2016, : 13 - 22
  • [3] iMLBench: A Machine Learning Benchmark Suite for CPU-GPU Integrated Architectures
    Zhang, Chenyang
    Zhang, Feng
    Guo, Xiaoguang
    He, Bingsheng
    Zhang, Xiao
    Du, Xiaoyong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (07) : 1740 - 1752
  • [4] Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm
    Khaled, Heba
    Faheem, Hossam El Deen Mostafa
    El Gohary, Rania
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2015, 12 (03) : 313 - 327
  • [5] Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters
    Yang, Chao-Tung
    Huang, Chih-Lin
    Lin, Cheng-Fang
    COMPUTER PHYSICS COMMUNICATIONS, 2011, 182 (01) : 266 - 269
  • [6] An Open MPI Extension for Supporting Task Based Parallelism in Heterogeneous CPU-GPU Clusters
    Cabello, Uriel
    Rodriguez, Jose
    Meneses-Viveros, Amilcar
    HIGH PERFORMANCE COMPUTER APPLICATIONS, 2016, 595 : 144 - 155
  • [7] Heterogeneous programming using OpenMP and CUDA/HIP for hybrid CPU-GPU scientific applications
    Tallada, Marc Gonzalez
    Morancho, Enric
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2023, 37 (05) : 626 - 646
  • [8] Scalable Critical Path Analysis for Hybrid MPI-CUDA Applications
    Schmitt, Felix
    Dietrich, Robert
    Juckeland, Guido
    PROCEEDINGS OF 2014 IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2014, : 909 - 916
  • [9] Accelerating Spatial Cross-Matching on CPU-GPU Hybrid Platform With CUDA and OpenACC
    Baig, Furqan
    Gao, Chao
    Teng, Dejun
    Kong, Jun
    Wang, Fusheng
    FRONTIERS IN BIG DATA, 2020, 3
  • [10] High efficient sedimentary basin simulations on hybrid CPU-GPU clusters
    Mei Wen
    Huayou Su
    Wenjie Wei
    Nan Wu
    Xing Cai
    Chunyuan Zhang
    Cluster Computing, 2014, 17 : 359 - 369