Optimized Fast Walsh-Hadamard Transform on OpenCL-GPU and OpenCL-CPU

Authors
Pereira, Pedro M. M. [1 ,2 ]
Domingues, Patricio [1 ,2 ]
Rodrigues, Nuno M. M. [1 ,2 ]
Faria, Sergio M. M. [1 ,2 ]
Falcao, Gabriel [2 ,3 ]
Affiliations
[1] Polytech Inst Leiria, Sch Technol & Management, Leiria, Portugal
[2] Inst Telecomunicacoes, Lisbon, Portugal
[3] Univ Coimbra, Dept Elect & Comp Engn, P-3000 Coimbra, Portugal
Keywords
Walsh-Hadamard Transform; Parallel Processing; OpenCL; SIMD; Image Processing Theory
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Walsh-Hadamard transform (WHT) plays a major role in many image and video coding algorithms. On the one hand, its intensive use in these algorithms makes accelerating it a worthwhile challenge for speeding up their execution. On the other hand, the available fast implementations are not efficient across different platforms. In this work, a parallel implementation of the WHT is proposed for CPU and GPU platforms using the OpenCL standard. OpenCL achieves portability at the code level, but performance suffers when the same code is used for both CPUs and GPUs. To achieve top performance, we propose two WHT versions: OpenCL-GPU for GPUs and OpenCL-CPU for CPUs. Broadly, OpenCL-GPU executed on a GPU runs faster than OpenCL-CPU executed on a multicore CPU, with speedups that range from 120.87 to 101635. However, OpenCL-GPU performance drops substantially when run on a multicore CPU machine, where OpenCL-CPU achieves higher performance, as it exploits the OpenCL support for SIMD instructions.
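For readers unfamiliar with the transform named in the abstract, the following is a minimal sequential sketch of the textbook fast Walsh-Hadamard butterfly (unnormalized, natural Hadamard ordering). It is not the paper's OpenCL-GPU or OpenCL-CPU kernel, neither of which is reproduced in this record; the function name `fwht` and the choice of Python are assumptions made purely for illustration.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural/Hadamard order).

    Illustrative sketch only: `fwht` is a hypothetical name, not an API
    from the paper. Returns a new list; the input is left unchanged.
    """
    n = len(values)
    # The butterfly structure requires a power-of-two length.
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    out = list(values)
    h = 1
    while h < n:
        # Each stage combines elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


if __name__ == "__main__":
    print(fwht([1, 0, 1, 0, 0, 1, 1, 0]))  # prints [4, 2, 0, -2, 0, 2, 0, 2]
```

Within a single stage, every butterfly reads and writes its own pair of elements, so the butterflies of that stage are independent of one another. That independence is the general property that parallel and SIMD implementations of the transform can exploit; how the paper's two versions actually map it to hardware is not described in the abstract.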
Pages: 6
Related papers
  • [1] CPU AND GPU CONSOLIDATION BASED ON OPENCL
    Bogdanov, A. V.
    Gankevich, I. G.
    Gaiduchok, V. Yu.
    Ko, Pyae Sone Ko
    DISTRIBUTED COMPUTING AND GRID-TECHNOLOGIES IN SCIENCE AND EDUCATION, 2012, : 66 - 70
  • [2] Fast slant transform algorithm based on the Walsh-Hadamard transform
    Glushkov Inst of Cybernetics, Kiev, Ukraine
    J Autom Inform Sci, 2 (1-11):
  • [3] Pauli decomposition via the fast Walsh-Hadamard transform
    Georges, Timothy N.
    Berntson, Bjorn K.
    Suenderhauf, Christoph
    Ivanov, Aleksei, V
    NEW JOURNAL OF PHYSICS, 2025, 27 (03):
  • [4] UNIFIED MATRIX TREATMENT OF FAST WALSH-HADAMARD TRANSFORM
    FINO, BJ
    ALGAZI, VR
    IEEE TRANSACTIONS ON COMPUTERS, 1976, 25 (11) : 1142 - 1146
  • [5] Optimized Fast Walsh-Hadamard Transform on GPUs for non-binary LDPC decoding
    Andrade, Joao
    Falcao, Gabriel
    Silva, Vitor
    PARALLEL COMPUTING, 2014, 40 (09) : 449 - 453
  • [6] FAST COMPUTATION OF DISCRETE HARTLEY TRANSFORM VIA WALSH-HADAMARD TRANSFORM
    HSU, CY
    WU, JL
    ELECTRONICS LETTERS, 1987, 23 (09) : 466 - 468
  • [7] Cache conscious Walsh-Hadamard Transform
    Park, N
    Prasanna, VK
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 1205 - 1208
  • [8] COMPUTATIONAL STRUCTURE FOR THE WALSH-HADAMARD TRANSFORM
    CHAIKIN, GM
    PROCEEDINGS OF THE SOCIETY OF PHOTO-OPTICAL INSTRUMENTATION ENGINEERS, 1982, 360 : 100 - 104
  • [9] OpenCL Kernel Fusion for GPU, Xeon Phi and CPU
    Filipovic, Jiri
    Benkner, Siegfried
    2015 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 2015, : 98 - 105
  • [10] Improvement and Application of Walsh-Hadamard Transform
    Wang, Sanfu
    Gao, Zhongshe
    PROCEEDINGS OF THE THIRD INTERNATIONAL WORKSHOP ON MATRIX ANALYSIS AND APPLICATIONS, VOL 2, 2009, : 363 - 366