Efficient Breadth First Search on Multi-GPU Systems Using GPU-Centric OpenSHMEM

Cited by: 2

Authors:

Potluri, Sreeram [1]

Goswami, Anshuman [1]

Venkata, Manjunath Gorentla [2]

Imam, Neena [2]

Affiliations:

[1] NVIDIA Corp, Santa Clara, CA 95051 USA

[2] Oak Ridge Natl Lab, Comp Sci & Math Div, Oak Ridge, TN USA

Source:

OPENSHMEM AND RELATED TECHNOLOGIES: BIG COMPUTE AND BIG DATA CONVERGENCE, OPENSHMEM 2017 | 2018 / Vol. 10679

Keywords:

DOI:

10.1007/978-3-319-73814-7_6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

NVSHMEM is an implementation of OpenSHMEM for NVIDIA GPUs that allows communication to be issued from inside CUDA kernels. In this work, we present an implementation of Breadth First Search (BFS) for multi-GPU systems using NVSHMEM. We analyze the benefits and bottlenecks of moving fine-grained communication into CUDA kernels. Using our BFS implementation, we achieve up to a 75% performance improvement over a CUDA-aware MPI-based implementation.

Citation

Pages: 82-96

Page count: 15

