Accelerating sparse matrix-matrix multiplication with GPU Tensor Cores

Cited by: 38
Authors
Zachariadis, Orestis [1]
Satpute, Nitin [1]
Gomez-Luna, Juan [2]
Olivares, Joaquin [1]
Affiliations
[1] Univ Cordoba, Dept Elect & Comp Engn, Cordoba, Spain
[2] Swiss Fed Inst Technol, Dept Comp Sci, Zurich, Switzerland
Funding
European Union Horizon 2020
Keywords
Sparse matrix multiplication; GPU; Tensor Cores; Parallel computing; SpGEMM; Many-core
DOI
10.1016/j.compeleceng.2020.106848
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Sparse general matrix-matrix multiplication (spGEMM) is an essential component in many scientific and data analytics applications. However, the sparsity pattern of the input matrices and the interaction of their patterns make spGEMM challenging. Modern GPUs include Tensor Core Units (TCUs), which specialize in dense matrix multiplication. Our aim is to re-purpose TCUs for sparse matrices. The key idea of our spGEMM algorithm, tSparse, is to multiply sparse rectangular blocks using the mixed precision mode of TCUs. tSparse partitions the input matrices into tiles and operates only on tiles that contain one or more elements. It creates a task list of the tiles and performs matrix multiplication of these tiles using TCUs. To the best of our knowledge, this is the first time that TCUs have been used in the context of spGEMM. We show that spGEMM, with our tiling approach, benefits from TCUs. Our approach significantly improves the performance of spGEMM in comparison to cuSPARSE, CUSP, RMerge2, Nsparse, AC-SpGEMM and spECK.
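The tiling idea in the abstract can be illustrated with a small CPU-only sketch. This is not the authors' tSparse implementation and it does not use Tensor Cores. It only mimics the described steps: partition both inputs into tiles, keep only non-empty tiles, build a task list of matching tile pairs, and multiply each pair as a small dense block. The 16x16 tile size, the dictionary-of-coordinates input format, and all function names (`to_half`, `tile_matrix`, `build_tasks`, `tiled_spgemm`) are assumptions made for illustration. Mixed precision is imitated by rounding inputs to IEEE half precision and accumulating in ordinary Python floats.

```python
import struct
from collections import defaultdict

TILE = 16  # assumed tile edge; chosen for illustration only


def to_half(x):
    """Round a float to IEEE half precision and back (imitates fp16 inputs)."""
    return struct.unpack("e", struct.pack("e", x))[0]


def tile_matrix(coo, tile=TILE):
    """Group nonzeros {(row, col): value} into dense tiles.

    Only tiles holding at least one element are created.
    """
    tiles = {}
    for (r, c), v in coo.items():
        key = (r // tile, c // tile)
        block = tiles.setdefault(key, [[0.0] * tile for _ in range(tile)])
        block[r % tile][c % tile] = to_half(v)
    return tiles


def build_tasks(a_tiles, b_tiles):
    """Pair every A tile (i, k) with every B tile (k, j) sharing the inner index k."""
    b_by_row = defaultdict(list)
    for (k, j) in b_tiles:
        b_by_row[k].append((k, j))
    return [(a_key, b_key) for a_key in a_tiles for b_key in b_by_row[a_key[1]]]


def multiply_tile(a, b, acc, tile=TILE):
    """Dense tile product accumulated into acc (stand-in for one Tensor Core MMA)."""
    for i in range(tile):
        for k in range(tile):
            aik = a[i][k]
            if aik == 0.0:
                continue
            for j in range(tile):
                acc[i][j] += aik * b[k][j]


def tiled_spgemm(a_coo, b_coo, tile=TILE):
    """Multiply two sparse matrices given as {(row, col): value} dictionaries."""
    a_tiles = tile_matrix(a_coo, tile)
    b_tiles = tile_matrix(b_coo, tile)
    c_tiles = {}
    for a_key, b_key in build_tasks(a_tiles, b_tiles):
        out_key = (a_key[0], b_key[1])
        acc = c_tiles.setdefault(out_key, [[0.0] * tile for _ in range(tile)])
        multiply_tile(a_tiles[a_key], b_tiles[b_key], acc, tile)
    # Flatten the output tiles back to coordinate form, dropping zeros.
    result = {}
    for (ti, tj), block in c_tiles.items():
        for i, row in enumerate(block):
            for j, v in enumerate(row):
                if v != 0.0:
                    result[(ti * tile + i, tj * tile + j)] = v
    return result


if __name__ == "__main__":
    a = {(0, 0): 2.0, (0, 17): 3.0, (20, 5): 1.5}
    b = {(0, 1): 4.0, (17, 1): 0.5, (5, 40): 2.0}
    print(tiled_spgemm(a, b))
```

In this sketch the task list plays the role the abstract assigns to it: work is generated only for tile pairs that both contain elements, so empty regions of the inputs are never touched. On a GPU, each task would map to a Tensor Core matrix-multiply-accumulate rather than the triple loop used here.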
Pages: 16
Related papers
50 in total
  • [21] DeltaSPARSE: High-Performance Sparse General Matrix-Matrix Multiplication on Multi-GPU Systems
    Yang, Shuai
    Zhang, Changyou
    Ma, Ji
    2023 IEEE 30TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS, HIPC 2023, 2023, : 194 - 202
  • [22] Accelerating Sparse Matrix-Matrix Multiplication with 3D-Stacked Logic-in-Memory Hardware
    Zhu, Qiuling
    Graf, Tobias
    Sumbul, H. Ekin
    Pileggi, Larry
    Franchetti, Franz
    2013 IEEE CONFERENCE ON HIGH PERFORMANCE EXTREME COMPUTING (HPEC), 2013,
  • [23] Design space exploration for sparse matrix-matrix multiplication on FPGAs
    Lin, Colin Yu
    Wong, Ngai
    So, Hayden Kwok-Hay
    INTERNATIONAL JOURNAL OF CIRCUIT THEORY AND APPLICATIONS, 2013, 41 (02) : 205 - 219
  • [24] EXPLOITING MULTIPLE LEVELS OF PARALLELISM IN SPARSE MATRIX-MATRIX MULTIPLICATION
    Azad, Ariful
    Ballard, Grey
    Buluc, Aydin
    Demmel, James
    Grigori, Laura
    Schwartz, Oded
    Toledo, Sivan
    Williams, Samuel
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2016, 38 (06) : C624 - C651
  • [25] PARALLEL SPARSE MATRIX-MATRIX MULTIPLICATION AND INDEXING: IMPLEMENTATION AND EXPERIMENTS
    Buluc, Aydin
    Gilbert, John R.
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2012, 34 (04) : C170 - C191
  • [26] Partitioning Models for Scaling Parallel Sparse Matrix-Matrix Multiplication
    Akbudak, Kadir
    Selvitopi, Oguz
    Aykanat, Cevdet
    ACM TRANSACTIONS ON PARALLEL COMPUTING, 2018, 4 (03)
  • [27] Sparse approximate matrix-matrix multiplication for density matrix purification with error control
    Artemov, Anton G.
    Rubensson, Emanuel H.
    JOURNAL OF COMPUTATIONAL PHYSICS, 2021, 438
  • [28] High-performance and Memory-saving Sparse General Matrix-Matrix Multiplication for NVIDIA Pascal GPU
    Nagasaka, Yusuke
    Nukada, Akira
    Matsuoka, Satoshi
    2017 46TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2017, : 101 - 110
  • [29] An Efficient Sparse Matrix Multiplication for skewed matrix on GPU
    Shah, Monika
    Patel, Vibha
    2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012, : 1301 - 1306
  • [30] Register-Aware Optimizations for Parallel Sparse Matrix-Matrix Multiplication
    Liu, Junhong
    He, Xin
    Liu, Weifeng
    Tan, Guangming
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2019, 47 (03) : 403 - 417