HISPE: High-Speed Configurable Floating-Point Multi-Precision Processing Element

Cited by: 0

Authors:

Tejas, B. N. [1]

Bhatia, Rakshit [1]

Rao, Madhav [1]

Affiliations:

[1] IIIT Bangalore, Bangalore, Karnataka, India

Source:

2024 25TH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN, ISQED 2024 | 2024

Keywords:

Floating Point (FP); Processing Element (PE); TensorFloat-32 (TF32); BrainFloat-16 (BF16); High-Performance Computing (HPC); Multiply-Accumulate (MAC);

DOI:

10.1109/ISQED60706.2024.10528733

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

A floating-point processing element (PE) needs multiple precision modes because they give it the flexibility to handle numerical data with differing precision and performance requirements. High-precision floating-point operations produce more accurate results and allow a greater range of numerical representation. Conversely, low-precision operations offer faster computation and lower power consumption. In this paper, we propose a configurable multi-precision processing element (PE) that supports Half Precision, Single Precision, Double Precision, BrainFloat-16 (BF-16) and TensorFloat-32 (TF-32). The design is realized in GPDK 45 nm technology and operates at a clock frequency of 281.9 MHz. The design was also implemented on a Xilinx ZCU104 FPGA evaluation board. Compared with previous state-of-the-art (SOTA) multi-precision PEs, the proposed design supports two more floating-point data formats, namely BF-16 and TF-32. It achieves the best energy performance, 2368.91 GFLOPS/W, and offers a 63% improvement in operating frequency with comparable footprint and power metrics.
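For readers unfamiliar with the five formats named in the abstract, the following minimal Python sketch tabulates their commonly published field widths (1 sign bit plus exponent and mantissa bits) and derives the exponent bias, unit roundoff step, and largest finite value from them. The class name `FPFormat` and the list `FORMATS` are hypothetical names introduced for illustration; the sketch describes the number formats only and makes no claim about how the proposed PE is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FPFormat:
    """Field widths of a binary floating-point format (sign bit is implicit: 1)."""
    name: str
    exp_bits: int   # exponent field width
    man_bits: int   # stored mantissa (fraction) width, excluding the hidden bit

    @property
    def total_bits(self) -> int:
        return 1 + self.exp_bits + self.man_bits

    @property
    def bias(self) -> int:
        # Standard exponent bias: 2^(e-1) - 1
        return (1 << (self.exp_bits - 1)) - 1

    @property
    def epsilon(self) -> float:
        # Spacing between 1.0 and the next representable value
        return 2.0 ** -self.man_bits

    @property
    def max_finite(self) -> float:
        # Largest finite value: all-ones mantissa, largest non-reserved exponent
        return (2.0 - 2.0 ** -self.man_bits) * 2.0 ** self.bias


# Commonly published layouts; TF32 occupies 19 bits (1 + 8 + 10).
FORMATS = [
    FPFormat("Half (FP16)", 5, 10),
    FPFormat("BrainFloat-16 (BF16)", 8, 7),
    FPFormat("TensorFloat-32 (TF32)", 8, 10),
    FPFormat("Single (FP32)", 8, 23),
    FPFormat("Double (FP64)", 11, 52),
]

for f in FORMATS:
    print(f"{f.name:24s} bits={f.total_bits:2d} bias={f.bias:4d} "
          f"eps={f.epsilon:.3e} max={f.max_finite:.3e}")
```

The table makes the trade-off in the abstract concrete: BF16 and TF32 keep the 8-bit exponent of single precision (same range) while cutting the mantissa, whereas FP16 trades range for mantissa bits.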


Pages: 8

50 items in total

[21] Design, Implementation and On-Chip High-Speed Test of SFQ Half-Precision Floating-Point Multiplier
Hara, Hiroshi
Obata, Koji
Park, Heejoung
Yamanashi, Yuki
Taketomi, Kazuhiro
Yoshikawa, Nobuyuki
Tanaka, Masamitsu
Fujimaki, Akira
Takagi, N.
Takagi, Kazuyoshi
Nagasawa, S.
IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY, 2009, 19 (03) : 657 - 660
[22] Low-latency High-throughput Multi-precision Fused Floating-point Division and Square-root Unit Design
Dai, Liangtao
Zhu, Haocheng
Yuan, Binzhe
Yang, Chao
Wang, Yuan
Lou, Xin
2024 INTERNATIONAL VLSI SYMPOSIUM ON TECHNOLOGY, SYSTEMS AND APPLICATIONS, VLSI TSA, 2024,
[23] Run-time reconfigurable multi-precision floating point multiplier design for high speed, low-power applications
Arish, S.
Sharma, R. K.
2ND INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN) 2015, 2015, : 902 - 907
[24] Testability analysis and scalable test generation for high-speed floating-point units
Xenoulis, George
Psarakis, Mihalis
Gizopoulos, Dimitris
Paschalis, Antonis
IEEE TRANSACTIONS ON COMPUTERS, 2006, 55 (11) : 1449 - U24
[25] FLOATING-POINT MU-P IMPLEMENTS HIGH-SPEED MATH FUNCTIONS
QUONG, D
EDN, 1986, 31 (03) : 143 - &
[26] Design and implementation of a GaAs systolic floating-point processing element
BeaumontSmith, A
Marwood, W
Lim, CC
Eshraghian, K
IEE PROCEEDINGS-COMPUTERS AND DIGITAL TECHNIQUES, 1996, 143 (05) : 325 - 330
[27] Design and Implementation of Accuracy Configurable Multi-Precision Multiplier Architecture for Signal Processing Applications
Ramya, R.
Moorthi, S.
2018 IEEE RECENT ADVANCES IN INTELLIGENT COMPUTATIONAL SYSTEMS (RAICS), 2018, : 89 - 93
[28] High-speed, area-efficient FPGA-based floating-point multiplier
Aty, GA
Hussein, AI
Ashour, IS
Mones, M
ICM 2003: PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE ON MICROELECTRONICS, 2003, : 274 - 277
[29] High-Speed Single Precision Floating Point Multiplier using CORDIC Algorithm
Yeshwanth, Balaji
Venkatesh, Vutukuri
Akhil, Repala
2018 3RD INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER, AND OPTIMIZATION TECHNIQUES (ICEECCOT - 2018), 2018, : 135 - 141
[30] Low-Latency Bit-Accurate Architecture for Configurable Precision Floating-Point Division
Xia, Jincheng
Fu, Wenjia
Liu, Ming
Wang, Mingjiang
APPLIED SCIENCES-BASEL, 2021, 11 (11):
