Efficient Breadth First Search on Multi-GPU Systems Using GPU-Centric OpenSHMEM

Cited by: 2

Authors:

Potluri, Sreeram [1]

Goswami, Anshuman [1]

Venkata, Manjunath Gorentla [2]

Imam, Neena [2]

Affiliations:

[1] NVIDIA Corp, Santa Clara, CA 95051 USA

[2] Oak Ridge Natl Lab, Comp Sci & Math Div, Oak Ridge, TN USA

Source:

OPENSHMEM AND RELATED TECHNOLOGIES: BIG COMPUTE AND BIG DATA CONVERGENCE, OPENSHMEM 2017 | 2018 / Vol. 10679

Keywords:

DOI:

10.1007/978-3-319-73814-7_6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

NVSHMEM is an implementation of OpenSHMEM for NVIDIA GPUs that allows communication to be issued from inside CUDA kernels. In this work, we present an implementation of Breadth First Search (BFS) for multi-GPU systems using NVSHMEM. We analyze the benefits and bottlenecks of moving fine-grained communication into CUDA kernels. Our BFS implementation achieves up to a 75% performance improvement over a CUDA-aware MPI-based implementation.

Pages: 82-96

Page count: 15
