Efficient GPU Implementation of Affine Index Permutations on Arrays

Authors
Bouverot-Dupuis, Mathis [1]
Sheeran, Mary [2]
Affiliations
[1] ENS Paris, Paris, France
[2] Chalmers Univ, Gothenburg, Sweden
Funding
Swedish Research Council;
Keywords
GPU; data-parallelism; functional languages; ALGORITHMS;
DOI
10.1145/3609024.3609411
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification codes
081202 ; 0835 ;
Abstract
Optimal usage of the memory system is a key element of fast GPU algorithms. Unfortunately, many common algorithms fail in this regard despite exhibiting great regularity in their memory access patterns. In this paper we propose efficient kernels to permute the elements of an array. We handle a class of permutations known as Bit Matrix Multiply Complement (BMMC) permutations, for which we design kernels of speed comparable to that of a simple array copy. This is a first step towards implementing a set of array combinators based on these permutations.
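As an illustration of what a BMMC permutation computes, the sketch below is a plain CPU reference in Python, not the GPU kernels the paper proposes. It assumes the standard definition: for an array of length 2^n, the element at index x (read as an n-bit vector) moves to index A·x XOR c, where A is a nonsingular n×n bit matrix over GF(2) and c is an n-bit complement vector. The function names, the row-bitmask encoding of A, and the scatter direction (source index mapped to destination index) are assumptions made for this example only.

```python
def bmmc_index(rows, c, x):
    """Return A.x XOR c over GF(2).

    rows[i] is row i of A packed as an integer bitmask
    (bit j of rows[i] is the entry A[i][j]); c and x are n-bit integers.
    """
    y = 0
    for i, row in enumerate(rows):
        # Output bit i is the parity of the bitwise AND of row i and x.
        bit = bin(row & x).count("1") & 1
        y |= bit << i
    return y ^ c


def bmmc_permute(rows, c, src):
    """Scatter src[x] to position A.x XOR c (A is assumed nonsingular)."""
    n = len(rows)
    assert len(src) == 1 << n, "array length must be 2**n"
    dst = [None] * len(src)
    for x, value in enumerate(src):
        dst[bmmc_index(rows, c, x)] = value
    return dst


# Bit reversal on 3-bit indices: A is the anti-diagonal matrix, c = 0.
bit_reverse_rows = [0b100, 0b010, 0b001]
# Array reversal on 3-bit indices: A is the identity, c = 0b111.
identity_rows = [0b001, 0b010, 0b100]

print(bmmc_permute(bit_reverse_rows, 0, list(range(8))))
print(bmmc_permute(identity_rows, 0b111, list(range(8))))
```

Bit reversal and array reversal are both members of the class, which hints at why it covers many regular access patterns. The sketch does not check that A is nonsingular; with a singular matrix the mapping is no longer a permutation and some output slots would stay empty.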
Pages: 15-28
Number of pages: 14