Optimal Architecture of Floating-Point Arithmetic for Neural Network Training Processors

Times Cited: 11
Authors
Junaid, Muhammad [1 ]
Arslan, Saad [2 ]
Lee, TaeGeon [1 ]
Kim, HyungWon [1 ]
Affiliations
[1] Chungbuk Natl Univ, Coll Elect & Comp Engn, Dept Elect, Cheongju 28644, South Korea
[2] COMSATS Univ Islamabad, Dept Elect & Comp Engn, Pk Rd, Islamabad 45550, Pakistan
Funding
National Research Foundation of Singapore
Keywords
floating-points; IEEE 754; convolutional neural network (CNN); MNIST dataset; accelerator
DOI
10.3390/s22031230
CLC Classification Number
O65 [Analytical Chemistry]
Subject Classification Codes
070302; 081704
Abstract
The convergence of artificial intelligence (AI) is one of the critical technologies of the recent fourth industrial revolution. The AIoT (Artificial Intelligence of Things) is expected to be a solution that aids rapid and secure data processing. While the success of AIoT demands low-power neural network processors, most recent research has focused on accelerator designs for inference only. The growing interest in self-supervised and semi-supervised learning now calls for processors that offload the training process in addition to inference. Incorporating training with high accuracy goals requires floating-point operators, yet higher-precision floating-point arithmetic architectures in neural networks tend to consume large area and energy; consequently, an energy-efficient and compact accelerator is required. The proposed architecture supports training in 32-bit, 24-bit, 16-bit, and mixed precisions to find the optimal floating-point format for low-power, small-sized edge devices. The proposed accelerator engines have been verified on an FPGA for both inference and training on the MNIST image dataset. The combination of a 24-bit custom floating-point format with 16-bit Brain FP achieves an accuracy of more than 93%. ASIC implementation of this optimized mixed-precision accelerator in TSMC 65 nm technology shows an active area of 1.036 × 1.036 mm² and an energy consumption of 4.445 μJ per training of one image. Compared with the 32-bit architecture, the size and energy are reduced by factors of 4.7 and 3.91, respectively. Therefore, a CNN structure using floating-point numbers with an optimized data path will contribute significantly to the AIoT field, which requires small area, low energy, and high accuracy.
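The precision trade-off described in the abstract can be sketched in a few lines of Python. The snippet below truncates IEEE 754 single-precision bit patterns to the 16-bit Brain FP layout (1 sign, 8 exponent, 7 mantissa bits) and to a hypothetical 24-bit layout (1 sign, 8 exponent, 15 mantissa bits). Note that the field split of the paper's custom 24-bit format is an assumption here, as this record does not specify it, and simple truncation is used where real hardware would typically round to nearest.

    import struct

    def f32_bits(x: float) -> int:
        # Reinterpret a float as its IEEE 754 single-precision bit pattern.
        return struct.unpack(">I", struct.pack(">f", x))[0]

    def bits_f32(b: int) -> float:
        # Inverse: turn a 32-bit pattern back into a float.
        return struct.unpack(">f", struct.pack(">I", b))[0]

    def to_bfloat16(x: float) -> float:
        # Brain FP keeps the top 16 bits: sign (1) + exponent (8) + mantissa (7).
        return bits_f32(f32_bits(x) & 0xFFFF0000)

    def to_fp24(x: float) -> float:
        # Assumed 24-bit custom format: sign (1) + exponent (8) + mantissa (15).
        return bits_f32(f32_bits(x) & 0xFFFFFF00)

    pi = 3.14159265
    print(to_bfloat16(pi))  # 3.140625        (7 mantissa bits)
    print(to_fp24(pi))      # 3.14154052...   (15 mantissa bits, closer to float32)

Under these layouts, both reduced formats retain float32's full 8-bit exponent range and give up only mantissa bits, which is consistent with the abstract's trade of precision for area and energy while keeping training numerically stable.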
Pages: 16
Related Papers
50 records in total
  • [31] Unum: Adaptive Floating-Point Arithmetic
    Morancho, Enric
    19th Euromicro Conference on Digital System Design (DSD 2016), 2016: 651-656
  • [32] Correction of Sum in Floating-Point Arithmetic
    Pichat, M.
    Numerische Mathematik, 1972, 19 (05): 400-&
  • [33] Binary Floating-Point Arithmetic [1]
    Zuras, Dan
    Dr. Dobb's Journal, 2005, 30 (04)
  • [34] Floating-Point Arithmetic in the Coq System
    Melquiond, Guillaume
    Information and Computation, 2012, 216: 14-23
  • [35] Numerical Investigation of Floating-Point Arithmetic
    Bakhrakh, S. M.
    Velichko, S. V.
    Pilipchatin, N. E.
    Spiridonov, V. F.
    Sukhov, E. G.
    Fedorova, Y. G.
    Kheifets, V. I.
    Programming and Computer Software, 1992, 18 (06): 255-258
  • [36] A Floating-Point Residue Arithmetic Unit
    Taylor, F. J.
    Huang, C. H.
    Journal of the Franklin Institute-Engineering and Applied Mathematics, 1981, 311 (01): 33-53
  • [37] IEEE 754-2008 Decimal Floating-Point for Intel® Architecture Processors
    Cornea, Marius
    ARITH: 2009 19th IEEE International Symposium on Computer Arithmetic, 2009: 225-228
  • [38] Enabling In-Network Floating-Point Arithmetic for Efficient Computation Offloading
    Cui, Penglai
    Pan, Heng
    Li, Zhenyu
    Zhang, Penghao
    Miao, Tianhao
    Zhou, Jianer
    Guan, Hongtao
    Xie, Gaogang
    IEEE Transactions on Parallel and Distributed Systems, 2022, 33 (12): 4918-4934
  • [39] Secure Floating-Point Training
    Rathee, Deevashwer
    Bhattacharya, Anwesh
    Gupta, Divya
    Sharma, Rahul
    Song, Dawn
    Proceedings of the 32nd USENIX Security Symposium, 2023: 6329-6346
  • [40] Optimal Error Estimates for Gaussian Elimination in Floating-Point Arithmetic
    Stummel, F.
    Zeitschrift für Angewandte Mathematik und Mechanik, 1982, 62 (5bis): T355-T357