Characterization of data movement requirements for sparse matrix computations on GPUs

Cited by: 4
|
Authors
Kurt, Sureyya Emre [1 ]
Thumma, Vineeth [1 ]
Hong, Changwan [1 ]
Sukumaran-Rajam, Aravind [1 ]
Sadayappan, P. [1 ]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
Funding
National Science Foundation (USA);
Keywords
data-movement bounds; sparse matrix-vector multiplication (SpMV); sparse matrix-matrix multiplication (SpGEMM); graph analytics; hypergraph partitioning; GPU computing;
DOI
10.1109/HiPC.2017.00040
Chinese Library Classification (CLC)
TP3 [computing technology, computer technology];
Discipline code
0812;
Abstract
Tight data-movement lower bounds are known for dense matrix-vector multiplication and dense matrix-matrix multiplication, and practical GPU implementations achieve performance quite close to the roofline bounds implied by their operational intensity. For large dense matrices, matrix-vector multiplication is bandwidth-limited and its performance is significantly lower than that of matrix-matrix multiplication. In contrast, sparse matrix-matrix multiplication (SpGEMM) generally performs much worse than sparse matrix-vector multiplication (SpMV). In this paper, we combine lower-bound and upper-bound analyses of data movement requirements with hardware-counter-based measurements to gain insight into the performance limitations of existing SpGEMM implementations on GPUs. The analysis motivates an adaptive work-distribution strategy among threads that improves the performance of SpGEMM code on GPUs.
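The roofline reasoning in the abstract can be sketched numerically. The script below is a minimal illustration, not taken from the paper: the peak-compute and bandwidth figures are assumed round numbers, and the byte counts are textbook estimates for double-precision dense GEMV and CSR-format SpMV. It shows why both kernels have low operational intensity and therefore a bandwidth-limited performance ceiling far below peak compute.

```python
# Back-of-envelope roofline sketch: dense GEMV and CSR SpMV both have low
# operational intensity (flops per byte of data movement), so both are
# bandwidth-bound rather than compute-bound.
# The peak rates below are illustrative assumptions, not measured values.

PEAK_FLOPS = 7.0e12   # assumed peak double-precision rate (flop/s)
PEAK_BW = 0.9e12      # assumed memory bandwidth (byte/s)

def roofline_bound(intensity):
    """Attainable flop/s = min(peak compute, intensity * bandwidth)."""
    return min(PEAK_FLOPS, intensity * PEAK_BW)

def gemv_intensity(n):
    """Dense y = A x: 2*n^2 flops; traffic dominated by A (8 bytes/entry)."""
    flops = 2.0 * n * n
    bytes_moved = 8.0 * (n * n + 2 * n)  # A, x, y in double precision
    return flops / bytes_moved

def csr_spmv_intensity(nnz, n):
    """CSR SpMV: 2 flops per nonzero; each nonzero moves an 8-byte value
    and a 4-byte column index, plus row pointers and the x/y vectors."""
    flops = 2.0 * nnz
    bytes_moved = 12.0 * nnz + 4.0 * (n + 1) + 16.0 * n
    return flops / bytes_moved

if __name__ == "__main__":
    n, nnz = 10_000, 100_000
    for name, oi in [("dense GEMV", gemv_intensity(n)),
                     ("CSR SpMV", csr_spmv_intensity(nnz, n))]:
        print(f"{name}: intensity {oi:.3f} flop/byte, "
              f"bound {roofline_bound(oi) / 1e9:.0f} Gflop/s")
```

Under these assumptions both kernels sit well below 0.3 flop/byte, so their attainable rate is a small fraction of peak compute; SpGEMM's irregular output structure lowers effective intensity further, which is the gap the paper's analysis quantifies.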
Pages: 283-293
Page count: 11
Related papers
50 items total
  • [21] Unleashing the performance of bmSparse for the sparse matrix multiplication in GPUs
    Berger, Gonzalo
    Freire, Manuel
    Marini, Renzo
    Dufrechou, Ernesto
    Ezzatti, Pablo
    PROCEEDINGS OF SCALA 2021: 12TH WORKSHOP ON LATEST ADVANCES IN SCALABLE ALGORITHMS FOR LARGE-SCALE SYSTEMS, 2021, : 19 - 26
  • [22] Sparse Matrix-Vector Product for the bmSparse Matrix Format in GPUs
    Berger, Gonzalo
    Dufrechou, Ernesto
    Ezzatti, Pablo
    EURO-PAR 2023: PARALLEL PROCESSING WORKSHOPS, PT I, EURO-PAR 2023, 2024, 14351 : 246 - 256
  • [23] Automating Wavefront Parallelization for Sparse Matrix Computations
    Venkat, Anand
    Mohammadi, Mahdi Soltan
    Park, Jongsoo
    Rong, Hongbo
    Barik, Rajkishore
    Strout, Michelle Mills
    Hall, Mary
    SC '16: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2016, : 480 - 491
  • [24] Toward an automatic parallelization of sparse matrix computations
    Adle, R
    Aiguier, M
    Delaplace, F
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2005, 65 (03) : 313 - 330
  • [25] PREDICTING STRUCTURE IN SPARSE-MATRIX COMPUTATIONS
    GILBERT, JR
    SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 1994, 15 (01) : 62 - 79
  • [26] Sparse matrix computations for dynamic network centrality
    Arrigo, F.
    Higham, D. J.
    2017, Springer Science and Business Media Deutschland GmbH (02)
  • [27] Sparse matrix computations on manycore GPU's
    Garland, Michael
    2008 45TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2, 2008, : 2 - 6
  • [28] Adaptive Optimization for Sparse Data on Heterogeneous GPUs
    Ma, Yujing
    Rusu, Florin
    Wu, Kesheng
    Sim, Alexander
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2022), 2022, : 1088 - 1097
  • [29] ON FINDING SUPERNODES FOR SPARSE-MATRIX COMPUTATIONS
    LIU, JWH
    NG, EG
    PEYTON, BW
    SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 1993, 14 (01) : 242 - 252
  • [30] Modelling the cache performance of sparse matrix computations
    Rauber, T
    Scholtes, C
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-V, 2000, : 2271 - 2277