Characterization of data movement requirements for sparse matrix computations on GPUs

Cited by: 4

Authors:

Kurt, Sureyya Emre [1]

Thumma, Vineeth [1]

Hong, Changwan [1]

Sukumaran-Rajam, Aravind [1]

Sadayappan, P. [1]

Affiliations:

[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA

Source:

2017 IEEE 24TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC) | 2017

Funding:

U.S. National Science Foundation

Keywords:

data-movement bounds; sparse matrix-vector multiplication (SpMV); sparse matrix-matrix multiplication (SpGEMM); graph analytics; hypergraph partitioning; GPU computing

DOI:

10.1109/HiPC.2017.00040

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Tight data movement lower bounds are known for dense matrix-vector multiplication and dense matrix-matrix multiplication, and practical implementations exist on GPUs that achieve performance quite close to the roofline bounds based on operational intensity. For large dense matrices, matrix-vector multiplication is bandwidth-limited and its performance is significantly lower than that of matrix-matrix multiplication. In contrast, the performance of sparse matrix-matrix multiplication (SpGEMM) is generally much lower than that of sparse matrix-vector multiplication (SpMV). In this paper, we use a combination of lower-bound and upper-bound analysis of data movement requirements, as well as hardware-counter-based measurements, to gain insights into the performance limitations of existing implementations for SpGEMM on GPUs. The analysis motivates the development of an adaptive work distribution strategy among threads and results in performance enhancement for SpGEMM code on GPUs.

Pages: 283-293

Page count: 11

50 records in total

[41] Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format
Shi, Shaohuai
Wang, Qiang
Chu, Xiaowen
2020 IEEE 26TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2020, : 19 - 26
[42] Automatic parallelization of sparse matrix computations: A static analysis
Adle, R
Aiguier, M
Delaplace, F
EURO-PAR 2000 PARALLEL PROCESSING, PROCEEDINGS, 2000, 1900 : 340 - 348
[43] A fast algorithm for sparse matrix computations related to inversion
Li, S.
Wu, W.
Darve, E.
JOURNAL OF COMPUTATIONAL PHYSICS, 2013, 242 : 915 - 945
[44] Sparse Matrix Computations Using the Quadtree Storage Format
Simecek, Ivan
11TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2009), 2009, : 168 - 173
[45] On the use of Java arrays for sparse matrix computations
Gundersen, G
Steihaug, T
PARALLEL COMPUTING: SOFTWARE TECHNOLOGY, ALGORITHMS, ARCHITECTURES AND APPLICATIONS, 2004, 13 : 119 - 126
[46] Pattern-Aware Vectorization for Sparse Matrix Computations
Abdelaal, Khaled
2021 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2021, : 1026 - 1026
[47] ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism
Cheshmi, Kazem
Kamil, Shoaib
Strout, Michelle Mills
Dehnavi, Maryam Mehri
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE, AND ANALYSIS (SC'18), 2018,
[48] The SPARAMAT approach to automatic comprehension of sparse matrix computations
Kessler, CW
Smith, CH
SEVENTH INTERNATIONAL WORKSHOP ON PROGRAM COMPREHENSION, PROCEEDINGS, 1999, : 200 - 207
[49] SPARSE-MATRIX COMPUTATIONS ON THE HYPERCUBE AND RELATED NETWORKS
MANZINI, G
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1994, 21 (02) : 169 - 183
[50] SPARSE-MATRIX COMPUTATIONS ON PARALLEL PROCESSOR ARRAYS
OGIELSKI, AT
AIELLO, W
SIAM JOURNAL ON SCIENTIFIC COMPUTING, 1993, 14 (03) : 519 - 530
