Exploring Data Layout for Sparse Tensor Times Dense Matrix on GPUs

Cited by: 1
Authors
Ahmad, Khalid [1]
Cecka, Cris [2]
Garland, Michael [2]
Hall, Mary [1]
Affiliations
[1] Univ Utah, Salt Lake City, UT 84108 USA
[2] NVIDIA Corp, Santa Clara, CA 95051 USA
Keywords
Sparse tensors; SpMM; data layout
DOI
10.1145/3633462
CLC number
TP3 [Computing Technology, Computer Technology]
Subject classification code
0812
Abstract
An important sparse tensor computation is sparse-tensor-dense-matrix multiplication (SpTM), which is used in tensor decomposition and its applications. SpTM is a multi-dimensional analog of sparse-matrix-dense-matrix multiplication (SpMM). In this article, we employ a hierarchical tensor data layout that can unfold a multidimensional tensor into a 2D matrix, making it possible to compute SpTM using SpMM kernel implementations for GPUs. We compare two SpMM-based approaches against the state-of-the-art PASTA sparse tensor contraction implementation: (1) SpMM with the hierarchical tensor data layout; and (2) unfolding followed by an invocation of cuSPARSE's SpMM. Results show that SpMM can outperform PASTA 70.9% of the time, but none of the three approaches is best overall. Therefore, we use a decision tree classifier to identify the best-performing sparse tensor contraction kernel based on precomputed properties of the sparse tensor.
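Below is a minimal sketch of the unfolding step described in the abstract, assuming a third-order COO tensor and using scipy.sparse on the CPU as a stand-in for a GPU SpMM kernel such as cuSPARSE's. All shapes, variable names, and the mode-1 column-mapping convention are illustrative, not the paper's actual implementation.

```python
import numpy as np
import scipy.sparse as sp

# Sketch: mode-1 unfolding of a sparse I x J x K tensor X into a 2D
# sparse matrix X_(1) of shape I x (J*K), then a mode-1 tensor-times-
# matrix product computed entirely with an SpMM (sparse x dense) call.
I, J, K, R = 4, 5, 6, 3
rng = np.random.default_rng(0)

nnz = 20
ii = rng.integers(0, I, nnz)   # COO coordinates and values
jj = rng.integers(0, J, nnz)
kk = rng.integers(0, K, nnz)
vals = rng.standard_normal(nnz)

# Mode-1 unfolding: entry (i, j, k) lands at row i, column j*K + k.
X1 = sp.coo_matrix((vals, (ii, jj * K + kk)), shape=(I, J * K)).tocsr()

# The mode-1 TTM  Y = X x_1 U  with dense U (R x I) satisfies
# Y_(1) = U @ X_(1); transposing puts the sparse operand on the left,
# matching an SpMM kernel's  C = A_sparse @ B_dense.
U = rng.standard_normal((R, I))
Y1 = (X1.T @ U.T).T            # sparse-times-dense, shape (R, J*K)
Y = Y1.reshape(R, J, K)        # fold the result back into a tensor

# Dense reference check of the unfold-then-SpMM result.
X_dense = np.zeros((I, J, K))
np.add.at(X_dense, (ii, jj, kk), vals)   # accumulate duplicate coords
assert np.allclose(Y, np.einsum('ri,ijk->rjk', U, X_dense))
```

A similarly hedged sketch of the final kernel-selection step: a decision tree trained on precomputed tensor properties. The feature set, training data, and label encoding below are invented for illustration; the record does not list the paper's actual features.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical per-tensor features: [nnz, I, J, K, density].
features = [[20,  4,  5,  6,  20 / (4 * 5 * 6)],
            [50,  8,  8,  8,  50 / (8 * 8 * 8)],
            [300, 16, 16, 16, 300 / 16**3]]
# Label = fastest kernel measured on that tensor:
# 0 = SpMM with hierarchical layout, 1 = unfold + cuSPARSE SpMM, 2 = PASTA.
labels = [0, 2, 1]

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(features, labels)
print(clf.predict([[35, 6, 6, 6, 35 / 216]]))  # predicted best kernel id
```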
Pages: 20
Related Papers
50 records in total
  • [41] Distributed-Memory Parallel Algorithms for Sparse Times Tall-Skinny-Dense Matrix Multiplication
    Selvitopi, Oguz
    Brock, Benjamin
    Nisa, Israt
    Tripathy, Alok
    Yelick, Katherine
    Buluc, Aydin
    PROCEEDINGS OF THE 2021 ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ICS 2021, 2021, : 431 - 442
  • [42] A DENSE GATE MATRIX LAYOUT METHOD FOR MOS VLSI
    LOPEZ, AD
    LAW, HFS
    IEEE TRANSACTIONS ON ELECTRON DEVICES, 1980, 27 (08) : 1671 - 1675
  • [43] A DENSE GATE MATRIX LAYOUT METHOD FOR MOS VLSI
    LOPEZ, AD
    LAW, HFS
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 1980, 15 (04) : 736 - 740
  • [44] The I/O Complexity of Sparse Matrix Dense Matrix Multiplication
    Greiner, Gero
    Jacob, Riko
    LATIN 2010: THEORETICAL INFORMATICS, 2010, 6034 : 143 - 156
  • [45] Register-based Implementation of the Sparse General Matrix-Matrix Multiplication on GPUs
    Liu, Junhong
    He, Xin
    Liu, Weifeng
    Tan, Guangming
    ACM SIGPLAN NOTICES, 2018, 53 (01) : 407 - 408
  • [46] TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs
    Niu, Yuyao
    Lu, Zhengyang
    Ji, Haonan
    Song, Shuhui
    Jin, Zhou
    Liu, Weifeng
    PPOPP'22: PROCEEDINGS OF THE 27TH ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING, 2022, : 90 - 106
  • [47] Matrix and tensor completion using tensor ring decomposition with sparse representation
    Asante-Mensah, Maame G.
    Ahmadi-Asl, Salman
    Cichocki, Andrzej
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2021, 2 (03):
  • [48] Impact of Tensor Cores and Mixed Precision on the Reliability of Matrix Multiplication in GPUs
    Basso, Pedro Martins
    dos Santos, Fernando Fernandes
    Rech, Paolo
    IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 2020, 67 (07) : 1560 - 1565
  • [49] Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications
    Ashari, Arash
    Sedaghati, Naser
    Eisenlohr, John
    Parthasarathy, Srinivasan
    Sadayappan, P.
    SC14: INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2014, : 781 - 792
  • [50] Characterizing Dataset Dependence for Sparse Matrix-Vector Multiplication on GPUs
    Sedaghati, Naser
    Ashari, Arash
    Pouchet, Louis-Noel
    Parthasarathy, Srinivasan
    Sadayappan, P.
    2ND WORKSHOP ON PARALLEL PROGRAMMING FOR ANALYTICS APPLICATIONS (PPAA 2015), 2015, : 17 - 24