HISPE: High-Speed Configurable Floating-Point Multi-Precision Processing Element

Cited by: 0
Authors
Tejas, B. N. [1 ]
Bhatia, Rakshit [1 ]
Rao, Madhav [1 ]
Affiliations
[1] IIIT Bangalore, Bangalore, Karnataka, India
Keywords
Floating Point (FP); Processing Element (PE); TensorFloat-32 (TF32); BrainFloat-16 (BF16); High-Performance Computing (HPC); Multiply-Accumulate (MAC)
DOI
10.1109/ISQED60706.2024.10528733
Chinese Library Classification
TP3 [Computing technology and computer technology];
Subject Classification Code
0812;
Abstract
Multiple precision modes are needed in a floating-point processing element (PE) because they provide flexibility in handling different types of numerical data with varying precision and performance requirements. High-precision floating-point operations produce highly precise and accurate results and allow a greater range of numerical representation. Conversely, low-precision operations offer faster computation and lower power consumption. In this paper, we propose a configurable multi-precision PE that supports Half Precision, Single Precision, Double Precision, BrainFloat-16 (BF-16), and TensorFloat-32 (TF-32). The design is realized in GPDK 45 nm technology and operates at a clock frequency of 281.9 MHz. The design was also implemented on the Xilinx ZCU104 FPGA evaluation board. Compared with previous state-of-the-art (SOTA) multi-precision PEs, the proposed design supports two additional floating-point data formats, namely BF-16 and TF-32. It achieves the best energy efficiency at 2368.91 GFLOPS/W and offers a 63% improvement in operating frequency with a comparable footprint and power.
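As a rough, illustrative aside (not taken from the paper), the Python sketch below lists the standard exponent and fraction widths of the five formats the abstract names and emulates reduced precision in software by truncating FP64 fraction bits. The FpFormat class, the truncate_to_format helper, and the use of truncation rather than round-to-nearest-even are assumptions made purely for illustration; they do not describe the HISPE datapath.

from dataclasses import dataclass
import struct

@dataclass(frozen=True)
class FpFormat:
    name: str
    exponent_bits: int
    fraction_bits: int  # explicit fraction bits, excluding the implicit leading 1

# Standard parameters for the five formats named in the abstract.
FORMATS = {
    "FP16": FpFormat("Half Precision", 5, 10),
    "FP32": FpFormat("Single Precision", 8, 23),
    "FP64": FpFormat("Double Precision", 11, 52),
    "BF16": FpFormat("BrainFloat-16", 8, 7),
    "TF32": FpFormat("TensorFloat-32", 8, 10),
}

def truncate_to_format(x: float, fmt: FpFormat) -> float:
    """Roughly emulate a lower-precision format by zeroing low-order FP64 fraction bits.

    This ignores exponent-range effects and uses truncation instead of
    round-to-nearest-even, so it only approximates a hardware datapath.
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    drop = 52 - fmt.fraction_bits       # FP64 carries 52 explicit fraction bits
    bits &= ~((1 << drop) - 1)          # clear the fraction bits the target format lacks
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

if __name__ == "__main__":
    x = 3.141592653589793
    for key, fmt in FORMATS.items():
        print(f"{key:5s} ({fmt.name}): {truncate_to_format(x, fmt):.12f}")

Note that this sketch only models fraction-width differences; in hardware, BF16 and TF32 also share the 8-bit FP32 exponent width, which is part of what makes them attractive companions to FP32 in a shared multi-precision datapath.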
Pages: 8