Open-Source SpMV Multiplication Hardware Accelerator for FPGA-Based HPC Systems

Cited by: 0

Authors:

Mpakos, Panagiotis [1]

Tasou, Ioanna [1]

Alverti, Chloe [3]

Miliadis, Panagiotis [1]

Malakonakis, Pavlos [2]

Theodoropoulos, Dimitris [1]

Goumas, Georgios [1]

Pnevmatikatos, Dionisios N. [1]

Koziris, Nectarios [1]

Affiliations:

[1] Natl Tech Univ Athens, Comp Syst Lab, Athens, Greece

[2] Tech Univ Crete, Khania, Greece

[3] Univ Illinois, Champaign, IL USA

Source:

APPLIED RECONFIGURABLE COMPUTING. ARCHITECTURES, TOOLS, AND APPLICATIONS, ARC 2024 | 2024 / Vol. 14553

Funding:

EU Horizon 2020;

Keywords:

Open-Source; SpMV; Sparse Matrix; HLS;

DOI:

10.1007/978-3-031-55673-9_2

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Sparse Matrix-Vector (SpMV) multiplication kernel is a key component of many high-performance computing applications, but it is also one of the most challenging to optimize, primarily due to its low flop-per-byte ratio and irregular memory accesses. As such, modern FPGAs combined with High-Bandwidth Memory (HBM) modules are much better suited to the memory-bound nature of this kernel than general-purpose CPUs. Current FPGA-based approaches to SpMV support only single-precision floating-point arithmetic. Moreover, they target highly streamed implementations that, although they enhance performance, employ custom matrix storage formats, which (i) can increase the matrix footprint by up to 3x, and (ii) shift the burden of input matrix transformation onto developers. Towards widening the spectrum of FPGA-supported floating-point formats for sparse algebra, this paper presents a first set of effective optimizations for double-precision SpMV hardware kernels using High-Level Synthesis (HLS) tools on HBM-equipped FPGAs. Results show that our work can provide 52.4x better performance on average compared to state-of-practice double-precision SpMV implementations on FPGAs for applications with volatile matrices, and up to 5.1x better performance-per-Watt compared to server-class CPUs.
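
As a rough illustration of why the abstract describes SpMV as memory-bound, the sketch below computes y = A·x for a matrix in Compressed Sparse Row (CSR) form and estimates its flop-per-byte ratio. CSR is chosen here only for illustration; the abstract does not say which storage format the accelerator uses. The function names, the 4-byte index width, the assumption of a square matrix, and the assumption that each array is moved exactly once are all hypothetical simplifications, not details from the paper.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is read at a data-dependent index: the irregular access.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def flops_per_byte(nnz, n_rows, value_bytes=8, index_bytes=4):
    """Rough arithmetic intensity, assuming a square matrix and that
    every array is moved to or from memory exactly once."""
    flops = 2 * nnz  # one multiply and one add per nonzero
    bytes_moved = (
        nnz * (value_bytes + index_bytes)  # values and column indices
        + (n_rows + 1) * index_bytes       # row pointers
        + 2 * n_rows * value_bytes         # read x once, write y once
    )
    return flops / bytes_moved


# 3x3 example matrix: [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]
y = spmv_csr(row_ptr, col_idx, values, x)
intensity = flops_per_byte(nnz=len(values), n_rows=len(y))
```

Under these assumptions the ratio stays below 2 flops per 12 bytes (about 0.17) for double-precision values however large the matrix is, which is consistent with the kernel being limited by memory bandwidth rather than by arithmetic throughput.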

Citation

Pages: 19-32

Page count: 14
