LV: Latency-Versatile Floating-Point Engine for High-Performance Deep Neural Networks

Cited by: 1
Authors
Lo, Yun-Chen [1 ]
Tsai, Yu-Chih [1 ]
Liu, Ren-Shuo [1 ]
Affiliations
[1] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu 300044, Taiwan
Keywords
Approximate computation; floating point; latency-versatile architecture
DOI
10.1109/LCA.2023.3287096
CLC Classification
TP3 [Computing technology, computer technology]
Subject Classification
0812
Abstract
Computing latency is an important system metric for Deep Neural Network (DNN) accelerators. To reduce latency, this work proposes LV, a latency-versatile floating-point engine (FP-PE) with two key contributions: 1) an approximate bit-versatile multiply-and-accumulate (BV-MAC) unit with an early shifter, and 2) an on-demand fixed-point-to-floating-point conversion (FXP2FP) unit. Extensive experimental results show that LV achieves up to 2.12x and 1.3x speedups over a baseline FP-PE and a redundancy-aware FP-PE, respectively, in TSMC 40-nm technology, while maintaining comparable accuracy on ImageNet classification tasks.
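The record does not detail how the paper's FXP2FP unit is built; as a rough illustration of the operation it names, the sketch below converts a signed fixed-point accumulator value into IEEE-754-style (sign, exponent, mantissa) fields by locating the leading one, deriving the exponent, and truncating the mantissa. All names here (fxp_to_fp, frac_bits, mant_bits, exp_bias) are hypothetical and not taken from the paper.

```python
import math

def fxp_to_fp(acc: int, frac_bits: int, mant_bits: int = 23, exp_bias: int = 127):
    """Hypothetical sketch of fixed-point -> floating-point conversion.

    `acc` is a signed integer whose real value is acc * 2**-frac_bits.
    Returns IEEE-754-style (sign, biased exponent, mantissa) fields,
    truncating low-order bits (round-toward-zero) for simplicity.
    """
    sign = 1 if acc < 0 else 0
    mag = abs(acc)
    if mag == 0:
        return sign, 0, 0                     # encode zero
    msb = mag.bit_length() - 1                # position of the leading one
    exp = (msb - frac_bits) + exp_bias        # unbiased exponent is msb - frac_bits
    # Normalize so the leading one becomes the implicit 1, then keep the
    # top mant_bits fraction bits (truncation, no round-to-nearest logic).
    if msb >= mant_bits:
        mant = (mag >> (msb - mant_bits)) & ((1 << mant_bits) - 1)
    else:
        mant = (mag << (mant_bits - msb)) & ((1 << mant_bits) - 1)
    return sign, exp, mant

# Example: 0b1011 with 2 fractional bits encodes 2.75.
s, e, m = fxp_to_fp(0b1011, frac_bits=2)
value = (-1) ** s * (1 + m / 2 ** 23) * 2.0 ** (e - 127)
assert math.isclose(value, 2.75)
```

An on-demand unit in the spirit of the abstract would presumably invoke such a conversion only when a fixed-point partial result must re-enter the floating-point datapath, rather than after every accumulation.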
Pages: 125-128
Page count: 4