Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU

Cited by: 3
Authors
Gao, Jiaquan [1]
Qi, Panpan [2]
He, Guixia [3]
Affiliations
[1] Nanjing Normal Univ, Sch Comp Sci & Technol, Nanjing 210023, Jiangsu, Peoples R China
[2] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Zhejiang, Peoples R China
[3] Zhejiang Univ Technol, Zhijiang Coll, Hangzhou 310024, Zhejiang, Peoples R China
Keywords
FORMAT; PERFORMANCE;
DOI
10.1155/2016/4596943
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Sparse matrix-vector multiplication (SpMV) is an important operation in computational science and needs to be accelerated because it often represents the dominant cost in many widely used iterative methods and eigenvalue problems. We achieve this objective by proposing a novel SpMV algorithm for the GPU based on the compressed sparse row (CSR) format. Our method dynamically assigns a different number of rows to each thread block and, for each block, selects an optimized implementation according to the number of rows the block contains. Accesses to the CSR arrays are fully coalesced, and the GPU's DRAM bandwidth is used efficiently by loading data into shared memory, which alleviates the bottleneck of many existing CSR-based algorithms (i.e., CSR-scalar and CSR-vector). Test results on C2050 and K20c GPUs show that our method outperforms a perfect-CSR algorithm that inspired our work, the vendor-tuned CUSPARSE V6.5 and CUSP V0.5.1, and three popular algorithms: clSpMV, CSR5, and CSR-Adaptive.
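As a rough illustration of the ideas in the abstract, the following CPU-side Python sketch shows a reference CSR SpMV and one possible way of giving each block a variable number of consecutive rows. It is not the authors' algorithm: the greedy packing rule, the `nnz_budget` parameter, and all function names are assumptions made for this example, and the GPU-specific parts (coalesced loads, shared memory, per-block kernel selection) are not modeled.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Reference y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Nonzeros of this row live in values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def assign_rows_to_blocks(row_ptr, nnz_budget):
    """Hypothetical greedy packing: each block takes consecutive rows
    until adding the next row would exceed nnz_budget nonzeros.
    Returns half-open (start_row, end_row) ranges."""
    n_rows = len(row_ptr) - 1
    blocks = []
    start = 0
    while start < n_rows:
        end = start + 1  # a block always holds at least one row
        while end < n_rows and row_ptr[end + 1] - row_ptr[start] <= nnz_budget:
            end += 1
        blocks.append((start, end))
        start = end
    return blocks


def spmv_by_blocks(row_ptr, col_idx, values, x, nnz_budget):
    """Compute y block by block; on a GPU each range would map to one
    thread block, here the ranges are simply processed in sequence."""
    y = [0.0] * (len(row_ptr) - 1)
    for start, end in assign_rows_to_blocks(row_ptr, nnz_budget):
        for row in range(start, end):
            y[row] = sum(values[k] * x[col_idx[k]]
                         for k in range(row_ptr[row], row_ptr[row + 1]))
    return y


# 4x4 example matrix:
# [[10, 0, 0, 2],
#  [ 0, 3, 0, 0],
#  [ 0, 0, 0, 0],
#  [ 1, 0, 4, 5]]
row_ptr = [0, 2, 3, 3, 6]
col_idx = [0, 3, 1, 0, 2, 3]
values = [10.0, 2.0, 3.0, 1.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0, 4.0]

print(csr_spmv(row_ptr, col_idx, values, x))
print(assign_rows_to_blocks(row_ptr, nnz_budget=3))
```

With a budget of three nonzeros, the three short rows are grouped into one block and the long last row gets a block of its own. This is the kind of row-count variation that a per-block choice of implementation could then exploit.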
Pages: 14