Compression of Deep Neural Networks with Structured Sparse Ternary Coding

Authors
Yoonho Boo
Wonyong Sung
Affiliations
[1] Seoul National University, School of Electrical Engineering, Neural Processing Research Center
Keywords
Deep neural networks; Weight compression; Structured sparsity; Fixed-point quantization; Pruning
Abstract
Deep neural networks (DNNs) contain a large number of weights and usually require many off-chip memory accesses for inference. Weight compression is therefore a major requirement for on-chip-memory-based implementations of DNNs, which not only increase inference speed but also reduce power consumption. We propose a weight compression method for deep neural networks that combines pruning and quantization. The proposed method allows weights to take the values +1 or −1 only at predetermined positions. A look-up table then stores all admissible combinations of sub-vectors of the weight matrices, so the structured sparse weights can be encoded and decoded easily. This method not only allows multiplication-free DNN implementations but also compresses the weight storage by as much as 32× compared with floating-point networks, with only a small performance loss. Weight distribution normalization and gradual pruning techniques are applied to reduce the performance degradation. Experiments are conducted with fully connected DNNs and convolutional neural networks.
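The coding scheme in the abstract can be illustrated with a minimal sketch (not the authors' exact algorithm; function names, the magnitude threshold, and the sub-vector parameters are illustrative assumptions): weights are pruned and ternarized to {−1, 0, +1}, each sub-vector is constrained to a few nonzero entries, and every admissible ternary pattern is enumerated once in a look-up table so a sub-vector can be stored as a single index.

```python
import itertools
import numpy as np

def ternarize(w, threshold):
    """Prune and quantize: zero out small-magnitude weights,
    keep only the sign (+1/-1) of the rest."""
    t = np.zeros_like(w, dtype=np.int8)
    t[w > threshold] = 1
    t[w < -threshold] = -1
    return t

def build_codebook(subvec_len, max_nonzeros):
    """Enumerate every ternary sub-vector of length `subvec_len` with at
    most `max_nonzeros` nonzero entries; the structured-sparsity
    constraint is what keeps this table small."""
    return [p for p in itertools.product((-1, 0, 1), repeat=subvec_len)
            if sum(x != 0 for x in p) <= max_nonzeros]

def encode(ternary_weights, codebook):
    """Replace each sub-vector by its index in the codebook
    (assumes pruning already enforced the sparsity constraint)."""
    lut = {pattern: i for i, pattern in enumerate(codebook)}
    n = len(codebook[0])
    return [lut[tuple(ternary_weights[i:i + n])]
            for i in range(0, len(ternary_weights), n)]

def decode(indices, codebook):
    """Reconstruct the ternary weight vector from stored indices."""
    return np.concatenate([np.array(codebook[i], dtype=np.int8)
                           for i in indices])
```

With sub-vectors of length 4 and at most 2 nonzeros, the codebook has 1 + 8 + 24 = 33 patterns, so each index fits in 6 bits, i.e. about 1.5 bits per weight versus 32 bits for a float, and inference needs only additions and subtractions of the decoded ±1 weights.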
Pages: 1009-1019 (10 pages)
Related Papers (50 in total)
  • [31] Anwar, Sajid; Hwang, Kyuyeon; Sung, Wonyong. Structured Pruning of Deep Convolutional Neural Networks. ACM Journal on Emerging Technologies in Computing Systems, 2017, 13(3).
  • [32] Gabbay, Freddy; Salomon, Benjamin; Shomron, Gil. Structured Compression of Convolutional Neural Networks for Specialized Tasks. Mathematics, 2022, 10(19).
  • [33] Wen, Wei; Wu, Chunpeng; Wang, Yandan; Chen, Yiran; Li, Hai. Learning Structured Sparsity in Deep Neural Networks. Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016.
  • [34] Barlaud, Michel; Guyard, Frederic. Learning Sparse Deep Neural Networks Using Efficient Structured Projections on Convex Constraints for Green AI. 2020 25th International Conference on Pattern Recognition (ICPR), 2021: 1566-1573.
  • [35] Dominguez, David; Gonzalez, Mario; Rodriguez, Francisco B.; Serrano, Eduardo; Erichsen, R., Jr.; Theumann, W. K. Structured Information in Sparse-Code Metric Neural Networks. Physica A: Statistical Mechanics and Its Applications, 2012, 391(3): 799-808.
  • [36] Teng, Fei; Liu, Yiming; Li, Tianrui; Zhang, Yi; Li, Shuangqing; Zhao, Yue. A Review on Deep Neural Networks for ICD Coding. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(5): 4357-4375.
  • [37] Basu, Sourya; Varshney, Lav R. Universal Source Coding of Deep Neural Networks. 2017 Data Compression Conference (DCC), 2017: 310-319.
  • [38] Huang, Kunping; Raviv, Netanel; Jain, Siddharth; Upadhyaya, Pulakesh; Bruck, Jehoshua; Siegel, Paul H.; Jiang, Anxiao. Improve Robustness of Deep Neural Networks by Coding. 2020 Information Theory and Applications Workshop (ITA), 2020.
  • [39] Zhang, Lei; Zhou, Shengyuan; Zhi, Tian; Du, Zidong; Chen, Yunji. TDSNN: From Deep Neural Networks to Deep Spike Neural Networks with Temporal-Coding. Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019: 1319-1326.