EMPIRICAL ANALYSIS OF IEEE754, FIXED-POINT AND POSIT IN LOW PRECISION MACHINE LEARNING

Cited by: 0
Authors
Ciocirlan, Stefan-Dan [1 ]
Neacs, Teodor-Andrei [1 ]
Rughinis, Razvan-Victor [1 ]
Affiliations
[1] Univ Politehn Bucuresti, Dept Comp Sci, Bucharest, Romania
Keywords
Number representation systems; IEEE754; Machine Learning; Knowledge Distillation; Posit;
DOI
N/A
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Deep neural networks have improved the state of the art in applications such as object classification, image segmentation and natural language processing. To increase their accuracy, they have become more complex and more costly in terms of storage, computation time and energy consumption. This paper addresses the storage problem and presents the advantages of using alternative number representations, such as fixed-point and posit numbers, for deep neural network inference. The networks were trained with 32-bit IEEE754 using the proposed framework Low Precision Machine Learning (LPML). Storage was optimized first through knowledge distillation and then by modifying the number representation and precision layer by layer. The first significant results were obtained by changing the network's number representation while keeping the same precision per layer. For a 2-layer network (2LayerNet), 16-bit Posit gives 93.45% accuracy, close to the 93.47% obtained with 32-bit IEEE754. Using 8-bit Posit decreases accuracy by 1.29% but reduces storage by 75%. Fixed-point representation proved sensitive to the number of bits allocated to the fractional part: a 4-4 bit fixed-point format (4 bits for the integer part and 4 bits for the fractional part) reduces storage by 75% but drops accuracy to 67.21%, whereas with at least 8 fractional bits the results are similar to 32-bit IEEE754. To increase accuracy before reducing precision, knowledge distillation was used: a ResNet18 network gained 0.87% in accuracy by using a ResNet34 as a professor (teacher) network. By changing the number representation system and precision per layer, storage was reduced by 43.47% while accuracy decreased by only 0.26%. In conclusion, combining knowledge distillation with per-layer changes of number representation and precision gives a ResNet18 network that uses 66.75% less storage than the ResNet34 professor network while losing only 1.38% in accuracy.
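The 4-4 fixed-point result above can be illustrated with a small quantization sketch. The snippet below is not taken from the LPML framework; it is a minimal NumPy simulation of a signed fixed-point format, under the assumption that the sign bit is counted among the integer bits, rounding each value to the nearest multiple of 2^-frac_bits and clipping it to the representable range.

import numpy as np

def quantize_fixed_point(x, int_bits=4, frac_bits=4):
    """Simulate a signed fixed-point format with `int_bits` integer bits
    (sign included, an assumption) and `frac_bits` fractional bits,
    e.g. the 4-4 format described in the abstract."""
    scale = 2.0 ** frac_bits
    lo = -(2.0 ** (int_bits - 1))             # most negative representable value
    hi = 2.0 ** (int_bits - 1) - 1.0 / scale  # most positive representable value
    return np.clip(np.round(x * scale) / scale, lo, hi)

# 32-bit IEEE754 weights stored in the 8-bit 4-4 format use 8/32 = 25%
# of the original space, i.e. the 75% storage reduction quoted above.
weights = np.random.randn(5).astype(np.float32)
print(weights)
print(quantize_fixed_point(weights))

With only 4 fractional bits the quantization step is 1/16, which is coarse enough to explain the accuracy drop reported for the 4-4 format, while 8 or more fractional bits leave typical weight values nearly unchanged.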
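The knowledge-distillation step, in which ResNet18 learns from a ResNet34 professor (teacher), can be sketched with the standard temperature-softened distillation loss. This is a generic PyTorch sketch, not the paper's implementation; the function name, temperature T and weighting alpha are illustrative assumptions.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Weighted sum of the KL divergence between temperature-softened
    teacher and student distributions and the cross-entropy on hard labels
    (T and alpha are illustrative, not the paper's settings)."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

During student training the teacher's logits would be computed under torch.no_grad() so that only the student's parameters are updated; after distillation, the student's layers can then be converted to the per-layer number representations and precisions discussed above.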
Pages: 13-24
Page count: 12