HISPE: High-Speed Configurable Floating-Point Multi-Precision Processing Element

Cited by: 0
Authors
Tejas, B. N. [1 ]
Bhatia, Rakshit [1 ]
Rao, Madhav [1 ]
Affiliations
[1] IIIT Bangalore, Bangalore, Karnataka, India
Keywords
Floating Point (FP); Processing Element (PE); TensorFloat-32 (TF32); BrainFloat-16 (BF16); High-Performance Computing (HPC); Multiply-Accumulate (MAC)
DOI
10.1109/ISQED60706.2024.10528733
Chinese Library Classification
TP3 [Computing technology and computer technology];
Subject Classification Code
0812;
Abstract
Multiple precision modes are needed in a floating-point processing element (PE) because they provide flexibility in handling different types of numerical data with varying precision and performance requirements. High-precision floating-point operations produce highly precise and accurate results and allow a greater range of numerical representation. Conversely, low-precision operations offer faster computation and lower power consumption. In this paper, we propose a configurable multi-precision PE that supports Half Precision, Single Precision, Double Precision, BrainFloat-16 (BF-16), and TensorFloat-32 (TF-32). The design is realized in GPDK 45 nm technology and operates at a clock frequency of 281.9 MHz. The design was also implemented on the Xilinx ZCU104 FPGA evaluation board. Compared with previous state-of-the-art (SOTA) multi-precision PEs, the proposed design supports two additional floating-point data formats, namely BF-16 and TF-32. It achieves the best energy efficiency at 2368.91 GFLOPS/W and offers a 63% improvement in operating frequency with a comparable footprint and power.
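As a rough, illustrative aside (not taken from the paper), the Python sketch below lists the standard exponent and fraction widths of the five formats the abstract names and emulates reduced precision in software by truncating FP64 fraction bits. The FpFormat class, the truncate_to_format helper, and the use of truncation rather than round-to-nearest-even are assumptions made purely for illustration; they do not describe the HISPE datapath.

from dataclasses import dataclass
import struct

@dataclass(frozen=True)
class FpFormat:
    name: str
    exponent_bits: int
    fraction_bits: int  # explicit fraction bits, excluding the implicit leading 1

# Standard parameters for the five formats named in the abstract.
FORMATS = {
    "FP16": FpFormat("Half Precision", 5, 10),
    "FP32": FpFormat("Single Precision", 8, 23),
    "FP64": FpFormat("Double Precision", 11, 52),
    "BF16": FpFormat("BrainFloat-16", 8, 7),
    "TF32": FpFormat("TensorFloat-32", 8, 10),
}

def truncate_to_format(x: float, fmt: FpFormat) -> float:
    """Roughly emulate a lower-precision format by zeroing low-order FP64 fraction bits.

    This ignores exponent-range effects and uses truncation instead of
    round-to-nearest-even, so it only approximates a hardware datapath.
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    drop = 52 - fmt.fraction_bits       # FP64 carries 52 explicit fraction bits
    bits &= ~((1 << drop) - 1)          # clear the fraction bits the target format lacks
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

if __name__ == "__main__":
    x = 3.141592653589793
    for key, fmt in FORMATS.items():
        print(f"{key:5s} ({fmt.name}): {truncate_to_format(x, fmt):.12f}")

Note that this sketch only models fraction-width differences; in hardware, BF16 and TF32 also share the 8-bit FP32 exponent width, which is part of what makes them attractive companions to FP32 in a shared multi-precision datapath.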
Pages: 8