Open-Source SpMV Multiplication Hardware Accelerator for FPGA-Based HPC Systems

Authors
Mpakos, Panagiotis [1]
Tasou, Ioanna [1]
Alverti, Chloe [3]
Miliadis, Panagiotis [1]
Malakonakis, Pavlos [2]
Theodoropoulos, Dimitris [1]
Goumas, Georgios [1]
Pnevmatikatos, Dionisios N. [1]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, Comp Syst Lab, Athens, Greece
[2] Tech Univ Crete, Khania, Greece
[3] Univ Illinois, Champaign, IL USA
Funding
European Union Horizon 2020
Keywords
Open-Source; SpMV; Sparse Matrix; HLS
DOI
10.1007/978-3-031-55673-9_2
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Sparse Matrix-Vector (SpMV) multiplication kernel is a key component of many high-performance computing applications, but also one of the most challenging to optimize, primarily due to its low flop-per-byte ratio and irregular memory accesses. Modern FPGAs, combined with High-Bandwidth Memory (HBM) modules, are therefore much better suited to the memory-bound nature of this kernel than general-purpose CPUs. Current FPGA-based approaches to SpMV support only single-precision floating-point arithmetic. Moreover, they target highly streamed implementations that, although they enhance performance, rely on custom matrix storage formats, which (i) can increase the matrix footprint by up to 3x, and (ii) shift the burden of input matrix transformation to developers. Towards widening the spectrum of FPGA-supported floating-point formats for sparse algebra, this paper presents a first set of effective optimizations for double-precision SpMV hardware kernels using High-Level Synthesis (HLS) tools on HBM-equipped FPGAs. Results show that our work provides on average 52.4x better performance than state-of-practice double-precision SpMV implementations on FPGAs for applications with volatile matrices, and up to 5.1x better performance-per-Watt than server-class CPUs.
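For readers unfamiliar with the kernel, the following is a minimal reference sketch of double-precision SpMV, not the paper's hardware design. It assumes the common Compressed Sparse Row (CSR) layout, which the abstract does not name; the function name `spmv_csr` and the example matrix are illustrative only. The indirect read `x[col_idx[k]]` is the source of the irregular memory accesses mentioned above, and, assuming 8-byte values and 4-byte column indices, each nonzero performs 2 flops for at least 12 bytes of matrix data read, which illustrates the low flop-per-byte ratio.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    Python floats are IEEE 754 doubles, so this is double precision.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect access into x: the address depends on the matrix data.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative 3x3 matrix (hypothetical data):
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)
```

CSR keeps the matrix in a compact, widely used form, so an application whose matrix changes between calls can pass it to the kernel without a separate format conversion step.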
Pages: 19-32
Number of pages: 14