LV: Latency-Versatile Floating-Point Engine for High-Performance Deep Neural Networks

Cited by: 1
Authors
Lo, Yun-Chen [1 ]
Tsai, Yu-Chih [1 ]
Liu, Ren-Shuo [1 ]
Affiliations
[1] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu 300044, Taiwan
Keywords
Approximate computation; floating point; latency-versatile architecture
DOI
10.1109/LCA.2023.3287096
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Computing latency is an important system metric for Deep Neural Network (DNN) accelerators. To reduce latency, this work proposes LV, a latency-versatile floating-point engine (FP-PE) that makes two key contributions: 1) an approximate bit-versatile multiply-and-accumulate (BV-MAC) unit with an early shifter and 2) an on-demand fixed-point-to-floating-point conversion (FXP2FP) unit. Extensive experimental results show that LV achieves up to 2.12x and 1.3x speedups over a baseline FP-PE and a redundancy-aware FP-PE, respectively, in TSMC 40-nm technology, while attaining comparable accuracy on ImageNet classification tasks.
Pages: 125-128
Page count: 4
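
As a rough illustration of the mechanisms named in the abstract, the Python sketch below models a bit-versatile MAC that multiplies only the top `bits` mantissa bits of each operand (fewer bits means a smaller, faster multiplier), an early shift that aligns each truncated product by its exponents directly into a shared fixed-point accumulator, and a single on-demand FXP2FP step that converts the accumulator back to floating point after all accumulations. The Q32.32 accumulator format, the function names, and the bit widths are illustrative assumptions, not the paper's microarchitecture.

```python
# Illustrative software model (not the paper's RTL) of:
#   1) BV-MAC: approximate multiply on the top `bits` mantissa bits,
#      with an "early shift" into a shared fixed-point accumulator;
#   2) FXP2FP: one on-demand conversion of the accumulator back to float.
# The Q32.32 accumulator format and all names here are assumptions.
import struct

def decompose(x: float):
    """Split a float32 into (sign, unbiased exponent, 24-bit mantissa)."""
    u = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = -1 if u >> 31 else 1
    exp = ((u >> 23) & 0xFF) - 127
    mant = (u & 0x7FFFFF) | 0x800000   # restore the hidden leading 1
    return sign, exp, mant

def bv_mac(acc: int, a: float, b: float, bits: int = 8) -> int:
    """Accumulate a*b into a Q32.32 fixed-point accumulator, multiplying
    only the top `bits` mantissa bits of each operand."""
    sa, ea, ma = decompose(a)
    sb, eb, mb = decompose(b)
    ma >>= 24 - bits                   # keep the most significant bits only
    mb >>= 24 - bits
    prod = sa * sb * ma * mb           # small approximate multiplier
    # Early shift: align the product by its exponents right away,
    # so partial sums accumulate in one shared fixed-point format.
    shift = ea + eb - 2 * (bits - 1) + 32
    return acc + (prod << shift if shift >= 0 else prod >> -shift)

def fxp2fp(acc: int) -> float:
    """On-demand FXP2FP: normalize the Q32.32 accumulator to float once,
    after all MACs, instead of renormalizing every partial sum."""
    return acc / float(1 << 32)

xs, ws = [0.1, -1.25, 3.0], [2.0, 0.75, -0.5]
exact = sum(x * w for x, w in zip(xs, ws))
for bits in (4, 8, 24):                # fewer bits -> shorter multiply latency
    acc = 0
    for x, w in zip(xs, ws):
        acc = bv_mac(acc, x, w, bits=bits)
    print(f"bits={bits:2d}  approx={fxp2fp(acc):+.6f}  exact={exact:+.6f}")
```

Sweeping `bits` exposes the latency-versus-accuracy knob the abstract describes: with 4-bit mantissas the dot product drifts visibly from the exact value, while the full 24-bit setting matches it up to fixed-point truncation.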