Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU

Cited by: 3

Authors:

Gao, Jiaquan [1]

Qi, Panpan [2]

He, Guixia [3]

Affiliations:

[1] Nanjing Normal Univ, Sch Comp Sci & Technol, Nanjing 210023, Jiangsu, Peoples R China

[2] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Zhejiang, Peoples R China

[3] Zhejiang Univ Technol, Zhijiang Coll, Hangzhou 310024, Zhejiang, Peoples R China

Source:

MATHEMATICAL PROBLEMS IN ENGINEERING | Year 2016 / Volume 2016

Keywords:

FORMAT; PERFORMANCE;

DOI:

10.1155/2016/4596943

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08 ;

Abstract:

Sparse matrix-vector multiplication (SpMV) is an important operation in computational science and needs to be accelerated because it often represents the dominant cost in many widely used iterative methods and eigenvalue problems. We achieve this objective by proposing a novel SpMV algorithm on the GPU based on the compressed sparse row (CSR) format. Our method dynamically assigns a different number of rows to each thread block and selects, for each block, an optimized implementation according to the number of rows the block contains. Accesses to the CSR arrays are fully coalesced, and the GPU's DRAM bandwidth is used efficiently by loading data into shared memory, which alleviates the bottleneck of many existing CSR-based algorithms (i.e., CSR-scalar and CSR-vector). Test results on C2050 and K20c GPUs show that our method outperforms a perfect-CSR algorithm that inspires our work, the vendor-tuned CUSPARSE V6.5 and CUSP V0.5.1, and three popular algorithms: clSpMV, CSR5, and CSR-Adaptive.
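For readers unfamiliar with the CSR format named in the abstract, the following is a minimal CPU-side sketch in Python, not the authors' GPU algorithm. `csr_spmv` computes y = A x with one loop iteration per row, which is the sequential analogue of the CSR-scalar idea of one thread per row. `partition_rows` is a purely hypothetical illustration of giving each block a different number of rows: it greedily groups consecutive rows under an assumed nonzero budget, `max_nnz_per_block`, which stands in for a shared-memory capacity. The function names, the parameter, and the greedy rule are assumptions made here for illustration; the abstract does not specify how rows are assigned.

```python
from typing import List, Sequence, Tuple


def csr_spmv(row_ptr: Sequence[int], col_idx: Sequence[int],
             values: Sequence[float], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i inside
    col_idx (column indices) and values (nonzero entries).
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):  # on a GPU, CSR-scalar maps one thread to each row
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def partition_rows(row_ptr: Sequence[int],
                   max_nnz_per_block: int) -> List[Tuple[int, int]]:
    """Hypothetical greedy grouping of consecutive rows into blocks.

    Each block is a half-open row range (start, end) holding at most
    max_nnz_per_block nonzeros; a single row that exceeds the budget
    still gets a block of its own.
    """
    n_rows = len(row_ptr) - 1
    blocks = []
    start = 0
    while start < n_rows:
        end = start + 1  # every block takes at least one row
        while end < n_rows and row_ptr[end + 1] - row_ptr[start] <= max_nnz_per_block:
            end += 1
        blocks.append((start, end))
        start = end
    return blocks


# Example: the 4x4 matrix
#   [1 0 2 0]
#   [0 3 0 0]
#   [4 5 6 7]
#   [0 0 0 8]
row_ptr = [0, 2, 3, 7, 8]
col_idx = [0, 2, 1, 0, 1, 2, 3, 3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [1.0, 2.0, 3.0, 4.0]

y = csr_spmv(row_ptr, col_idx, values, x)
blocks = partition_rows(row_ptr, max_nnz_per_block=4)
```

With a budget of four nonzeros, the two short leading rows share one block while the dense third row occupies a block alone, which illustrates why a per-block choice of kernel could pay off on matrices with irregular row lengths.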


Pages: 14

