LV: Latency-Versatile Floating-Point Engine for High-Performance Deep Neural Networks

Cited by: 1
Authors
Lo, Yun-Chen [1]
Tsai, Yu-Chih [1]
Liu, Ren-Shuo [1]
Affiliation
[1] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu 300044, Taiwan
Keywords
Approximate computation; floating point; latency-versatile architecture
DOI
10.1109/LCA.2023.3287096
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Computing latency is an important system metric for deep neural network (DNN) accelerators. To reduce latency, this work proposes LV, a latency-versatile floating-point processing engine (FP-PE) with two key contributions: 1) an approximate bit-versatile multiply-and-accumulate (BV-MAC) unit with an early shifter and 2) an on-demand fixed-point-to-floating-point conversion (FXP2FP) unit. Experimental results in TSMC 40-nm technology show that LV achieves speedups of up to 2.12x over a baseline FP-PE and up to 1.3x over a redundancy-aware FP-PE, while maintaining comparable accuracy on ImageNet classification tasks.
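As a point of reference for the FXP2FP term, the following is a minimal, generic sketch of what a fixed-point-to-floating-point conversion computes: locate the leading one, derive the exponent from its position, and keep the bits below it as the mantissa. It is not the paper's hardware design; the function names `fxp_to_fp` and `fp_to_float`, the 10-bit mantissa default, and the truncating (non-rounding) behavior are all assumptions chosen for illustration.

```python
def fxp_to_fp(acc, frac_bits, mant_bits=10):
    """Convert a signed fixed-point integer to (sign, exponent, mantissa).

    `acc` represents the value acc / 2**frac_bits. The result satisfies
    value ~= (-1)**sign * (1 + mantissa / 2**mant_bits) * 2**exponent,
    with low-order bits truncated (no rounding). Hypothetical helper,
    not the FXP2FP unit described in the paper.
    """
    if acc == 0:
        return 0, None, 0  # exponent None marks an exact zero
    sign = 1 if acc < 0 else 0
    mag = abs(acc)
    msb = mag.bit_length() - 1        # position of the leading one
    exponent = msb - frac_bits        # unbiased exponent
    # Align the leading one to bit position `mant_bits`.
    if msb >= mant_bits:
        aligned = mag >> (msb - mant_bits)
    else:
        aligned = mag << (mant_bits - msb)
    mantissa = aligned & ((1 << mant_bits) - 1)  # drop the implicit one
    return sign, exponent, mantissa


def fp_to_float(sign, exponent, mantissa, mant_bits=10):
    """Decode the triple back to a Python float, for checking only."""
    if exponent is None:
        return 0.0
    return (-1.0) ** sign * (1 + mantissa / (1 << mant_bits)) * 2.0 ** exponent


# 0b1011000 with 4 fractional bits is 88 / 16 = 5.5
print(fxp_to_fp(0b1011000, frac_bits=4))
print(fp_to_float(*fxp_to_fp(0b1011000, frac_bits=4)))
```

In this sketch the cost of the conversion is dominated by finding the leading one and shifting, which is the kind of work a design might choose to perform only when a floating-point result is actually needed.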
Pages: 125-128
Number of pages: 4