LV: Latency-Versatile Floating-Point Engine for High-Performance Deep Neural Networks

Cited by: 1
Authors
Lo, Yun-Chen [1 ]
Tsai, Yu-Chih [1 ]
Liu, Ren-Shuo [1 ]
Affiliations
[1] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu 300044, Taiwan
Keywords
Approximate computation; floating point; latency-versatile architecture
DOI
10.1109/LCA.2023.3287096
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Computing latency is an important system metric for Deep Neural Network (DNN) accelerators. To reduce latency, this work proposes LV, a latency-versatile floating-point engine (FP-PE) that makes two key contributions: 1) an approximate bit-versatile multiply-and-accumulate (BV-MAC) unit with an early shifter and 2) an on-demand fixed-point-to-floating-point conversion (FXP2FP) unit. Extensive experimental results show that LV achieves up to 2.12x and 1.3x speedups over a baseline FP-PE and a redundancy-aware FP-PE, respectively, in TSMC 40-nm technology, while attaining comparable accuracy on ImageNet classification tasks.
Pages: 125-128
Page count: 4
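
As a rough illustration of the mechanisms named in the abstract, the Python sketch below models a bit-versatile MAC that multiplies only the top `bits` mantissa bits of each operand (fewer bits means a smaller, faster multiplier), an early shift that aligns each truncated product by its exponents directly into a shared fixed-point accumulator, and a single on-demand FXP2FP step that converts the accumulator back to floating point after all accumulations. The Q32.32 accumulator format, the function names, and the bit widths are illustrative assumptions, not the paper's microarchitecture.

```python
# Illustrative software model (not the paper's RTL) of:
#   1) BV-MAC: approximate multiply on the top `bits` mantissa bits,
#      with an "early shift" into a shared fixed-point accumulator;
#   2) FXP2FP: one on-demand conversion of the accumulator back to float.
# The Q32.32 accumulator format and all names here are assumptions.
import struct

def decompose(x: float):
    """Split a float32 into (sign, unbiased exponent, 24-bit mantissa)."""
    u = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = -1 if u >> 31 else 1
    exp = ((u >> 23) & 0xFF) - 127
    mant = (u & 0x7FFFFF) | 0x800000   # restore the hidden leading 1
    return sign, exp, mant

def bv_mac(acc: int, a: float, b: float, bits: int = 8) -> int:
    """Accumulate a*b into a Q32.32 fixed-point accumulator, multiplying
    only the top `bits` mantissa bits of each operand."""
    sa, ea, ma = decompose(a)
    sb, eb, mb = decompose(b)
    ma >>= 24 - bits                   # keep the most significant bits only
    mb >>= 24 - bits
    prod = sa * sb * ma * mb           # small approximate multiplier
    # Early shift: align the product by its exponents right away,
    # so partial sums accumulate in one shared fixed-point format.
    shift = ea + eb - 2 * (bits - 1) + 32
    return acc + (prod << shift if shift >= 0 else prod >> -shift)

def fxp2fp(acc: int) -> float:
    """On-demand FXP2FP: normalize the Q32.32 accumulator to float once,
    after all MACs, instead of renormalizing every partial sum."""
    return acc / float(1 << 32)

xs, ws = [0.1, -1.25, 3.0], [2.0, 0.75, -0.5]
exact = sum(x * w for x, w in zip(xs, ws))
for bits in (4, 8, 24):                # fewer bits -> shorter multiply latency
    acc = 0
    for x, w in zip(xs, ws):
        acc = bv_mac(acc, x, w, bits=bits)
    print(f"bits={bits:2d}  approx={fxp2fp(acc):+.6f}  exact={exact:+.6f}")
```

Sweeping `bits` exposes the latency-versus-accuracy knob the abstract describes: with 4-bit mantissas the dot product drifts visibly from the exact value, while the full 24-bit setting matches it up to fixed-point truncation.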