Efficient Breadth First Search on Multi-GPU Systems Using GPU-Centric OpenSHMEM

Cited by: 2
Authors
Potluri, Sreeram [1]
Goswami, Anshuman [1]
Venkata, Manjunath Gorentla [2]
Imam, Neena [2]
Affiliations
[1] NVIDIA Corp, Santa Clara, CA 95051 USA
[2] Oak Ridge Natl Lab, Comp Sci & Math Div, Oak Ridge, TN USA
DOI
10.1007/978-3-319-73814-7_6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
NVSHMEM is an implementation of OpenSHMEM for NVIDIA GPUs that allows communication to be issued from inside CUDA kernels. In this work, we present an implementation of Breadth First Search (BFS) for multi-GPU systems using NVSHMEM. We analyze the benefits and bottlenecks of moving fine-grained communication into CUDA kernels. Our BFS implementation achieves up to 75% better performance than a CUDA-aware MPI-based implementation.
Pages: 82-96
Page count: 15