Efficient Breadth First Search on Multi-GPU Systems Using GPU-Centric OpenSHMEM

Cited by: 2
Authors
Potluri, Sreeram [1]
Goswami, Anshuman [1]
Venkata, Manjunath Gorentla [2]
Imam, Neena [2]
Affiliations
[1] NVIDIA Corp, Santa Clara, CA 95051 USA
[2] Oak Ridge Natl Lab, Comp Sci & Math Div, Oak Ridge, TN USA
DOI
10.1007/978-3-319-73814-7_6
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
NVSHMEM is an implementation of OpenSHMEM for NVIDIA GPUs that allows communication to be issued from inside CUDA kernels. In this work, we present an implementation of breadth-first search (BFS) for multi-GPU systems using NVSHMEM. We analyze the benefits and bottlenecks of moving fine-grained communication into CUDA kernels. Our BFS implementation achieves up to a 75% performance improvement over a CUDA-aware MPI-based implementation.
Pages: 82-96
Number of pages: 15