Characterization of data movement requirements for sparse matrix computations on GPUs

Cited by: 4
Authors
Kurt, Sureyya Emre [1 ]
Thumma, Vineeth [1 ]
Hong, Changwan [1 ]
Sukumaran-Rajam, Aravind [1 ]
Sadayappan, P. [1 ]
Affiliation
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
Funding
US National Science Foundation
Keywords
data-movement bounds; sparse matrix-vector multiplication (SpMV); sparse matrix-matrix multiplication (SpGEMM); graph analytics; hypergraph partitioning; GPU computing;
DOI
10.1109/HiPC.2017.00040
Chinese Library Classification
TP3 (Computing Technology, Computer Technology)
Discipline Code
0812
Abstract
Tight data-movement lower bounds are known for dense matrix-vector multiplication and dense matrix-matrix multiplication, and practical GPU implementations achieve performance quite close to the roofline bounds based on operational intensity. For large dense matrices, matrix-vector multiplication is bandwidth-limited and its performance is significantly lower than that of matrix-matrix multiplication. In contrast, the performance of sparse matrix-matrix multiplication (SpGEMM) is generally much lower than that of sparse matrix-vector multiplication (SpMV). In this paper, we use a combination of lower-bound and upper-bound analysis of data-movement requirements, together with hardware-counter-based measurements, to gain insight into the performance limitations of existing SpGEMM implementations on GPUs. The analysis motivates the development of an adaptive work-distribution strategy among threads and results in performance improvements for SpGEMM code on GPUs.
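The abstract's claim that SpMV is bandwidth-limited can be illustrated with a back-of-envelope roofline estimate. The sketch below is not from the paper: the hardware numbers are hypothetical, and the per-nonzero byte count is an optimistic accounting (value plus column index, assuming the input and output vectors are reused from cache) for a CSR-format SpMV.

```python
# Back-of-envelope roofline estimate for CSR SpMV in double precision.
# Hardware numbers below are hypothetical, chosen only for illustration.
PEAK_FLOPS = 7.0e12   # assumed peak double-precision rate, flop/s
BANDWIDTH = 9.0e11    # assumed peak memory bandwidth, byte/s

def spmv_operational_intensity(bytes_per_val=8, bytes_per_idx=4):
    """Flops per byte for CSR SpMV, counting only per-nonzero traffic:
    each nonzero costs one multiply-add (2 flops) and must stream its
    8-byte value and 4-byte column index; vector reuse is assumed free."""
    flops_per_nnz = 2
    bytes_per_nnz = bytes_per_val + bytes_per_idx
    return flops_per_nnz / bytes_per_nnz

def roofline_attainable(oi):
    """Attainable flop rate under the roofline model: the lesser of the
    compute peak and what the memory system can feed at intensity oi."""
    return min(PEAK_FLOPS, BANDWIDTH * oi)

oi = spmv_operational_intensity()   # 2/12 ≈ 0.167 flop/byte
perf = roofline_attainable(oi)      # bandwidth-bound, far below PEAK_FLOPS
```

With an operational intensity near 0.17 flop/byte, the attainable rate is a small fraction of peak, which is why large-matrix SpMV sits on the bandwidth-limited slope of the roofline.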
Pages: 283-293 (11 pages)