Efficient Breadth First Search on Multi-GPU Systems Using GPU-Centric OpenSHMEM

Cited by: 2
Authors
Potluri, Sreeram [1]
Goswami, Anshuman [1]
Venkata, Manjunath Gorentla [2]
Imam, Neena [2]
Affiliations
[1] NVIDIA Corp, Santa Clara, CA 95051 USA
[2] Oak Ridge Natl Lab, Comp Sci & Math Div, Oak Ridge, TN USA
DOI
10.1007/978-3-319-73814-7_6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
NVSHMEM is an implementation of OpenSHMEM for NVIDIA GPUs that allows communication to be issued from inside CUDA kernels. In this work, we present an implementation of Breadth First Search (BFS) for multi-GPU systems using NVSHMEM. We analyze the benefits and bottlenecks of moving fine-grained communication into CUDA kernels. With our BFS implementation, we achieve up to a 75% performance improvement over a CUDA-aware MPI-based implementation.
Pages: 82-96
Page count: 15