Reduce, Reuse, and Adapt: Accelerating Graph Processing on GPUs

Cited by: 0

Authors:

Ullas, A. [1]

Nasre, Rupesh [2]

Govindarajan, R. [1]

Affiliations:

[1] Indian Inst Sci, Comp Sci & Automat, Bangalore, Karnataka, India

[2] Indian Inst Technol Madras, Comp Sci & Engn, Madras, Tamil Nadu, India

Source:

2023 IEEE 30TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS, HIPC 2023 | 2023

Keywords:

Graphs; Connected Components; GPU; Push; Pull; Hybrid; SSSP; BFS; PR; CC;

DOI:

10.1109/HiPC58850.2023.00050

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Designing parallel graph algorithms for GPUs remains challenging. We observe three limitations in existing work. First, algorithms often rely on only one strategy to propagate information, push or pull, and neither is optimal in many cases. Second, the cost of updating the underlying data structures in every iteration is high, which results in significant performance overhead. Third, given the inherent irregularity of graph processing, a one-size-fits-all approach is too rigid for different types of graphs. In this work, we address these shortcomings by improving the processing of an existing graph framework, Subway. In particular, we propose a novel technique that amalgamates the two propagation strategies (push and pull) into a hybrid traversal strategy. In this strategy, each vertex pulls information from its neighbours, performs a local computation, and then pushes the result to all its neighbours, all within a single iteration. We propose reusing the SubCSR structure in Subway across a few iterations to significantly reduce the computational overhead without compromising the correctness or efficiency of the algorithm. Furthermore, we explore heuristics for deciding when to use the push, pull, or hybrid traversal strategy. We illustrate the effectiveness of our three-pronged approach by applying it to four popular graph algorithms: Connected Components (CC), Single-Source Shortest Path (SSSP), Breadth-First Search (BFS), and PageRank (PR) on an NVIDIA GeForce RTX 3060 GPU. Our extensive experimental evaluation on this GPU shows that the proposed hybrid approach, with adaptive heuristics and approximate SubCSR computation, reduces the execution time of CC, SSSP, and PR by 31%, 7.56%, and 6.43%, respectively, compared to the faster of the push and pull algorithms that use the SubCSR structure.
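The pull, compute, push step described in the abstract can be illustrated with a minimal sequential sketch for Connected Components. This is not the authors' GPU implementation and does not use Subway or SubCSR; it is a CPU-only Python illustration under simplifying assumptions (undirected graph, min-label propagation, vertices processed one at a time). The function name `hybrid_cc` and its parameters are hypothetical.

```python
from collections import defaultdict


def hybrid_cc(num_vertices, edges):
    """Connected components by min-label propagation (illustrative only).

    Each active vertex pulls the smallest label among itself and its
    neighbours, adopts it, and pushes it to every neighbour whose label
    is larger, all within the same iteration.
    """
    # Build an undirected adjacency list (assumption: edges are undirected).
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    label = list(range(num_vertices))  # every vertex starts in its own component
    active = set(range(num_vertices))  # all vertices are active initially

    while active:
        next_active = set()
        for v in sorted(active):
            # Pull: read the labels of all neighbours.
            best = min([label[v]] + [label[n] for n in adj[v]])
            # Local computation: adopt the smallest label seen.
            label[v] = best
            # Push: write the result to neighbours that would improve.
            for n in adj[v]:
                if best < label[n]:
                    label[n] = best
                    next_active.add(n)  # n must propagate its new label
        active = next_active

    return label


if __name__ == "__main__":
    # Two components {0, 1, 2} and {3, 4}, plus the isolated vertex 5.
    print(hybrid_cc(6, [(0, 1), (1, 2), (3, 4)]))
```

In this sketch a vertex whose label changes during the pull pushes it immediately, so only vertices updated by a push need to be active in the next iteration. A GPU version would process active vertices in parallel and would need atomic minimum updates for the push; those details are outside what the abstract specifies.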

Pages: 335-346

Page count: 12
