Compression of Deep Neural Networks with Structured Sparse Ternary Coding

Authors
Yoonho Boo
Wonyong Sung
Affiliation
[1] Seoul National University, School of Electrical Engineering, Neural Processing Research Center
Keywords
Deep neural networks; Weight compression; Structured sparsity; Fixed-point quantization; Pruning
Abstract
Deep neural networks (DNNs) contain a large number of weights and usually require many off-chip memory accesses during inference. Compressing the weights is therefore a key requirement for on-chip-memory-based DNN implementations, which both increases inference speed and reduces power consumption. We propose a weight compression method for deep neural networks that combines pruning and quantization. The proposed method allows weights to take the values +1 or −1 only at predetermined positions, so that a look-up table can store all possible sub-vectors of the weight matrices. Structured sparse weights can then be encoded and decoded easily with this table. The method not only enables multiplication-free DNN implementations but also compresses the weight storage by as much as 32× relative to floating-point networks, with only a small performance loss. Weight-distribution normalization and gradual pruning are applied to reduce the performance degradation. Experiments are conducted on fully connected DNNs and convolutional neural networks.
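To make the coding scheme concrete, below is a minimal Python sketch of the look-up-table idea described in the abstract, assuming 8-weight sub-vectors with exactly 2 ternary (+1/−1) nonzeros each; the function names, the per-block scale, and the nearest-codeword assignment are illustrative assumptions, not the authors' published algorithm. With these parameters the codebook holds C(8,2) * 2^2 = 112 codewords, so each 8-weight block reduces to a 7-bit index plus one shared scale, on the order of the 32× compression over 32-bit floating point mentioned above.

    import itertools
    import numpy as np

    def build_codebook(block_size=8, nonzeros=2):
        # Enumerate every sub-vector with exactly `nonzeros` entries
        # drawn from {+1, -1} at chosen positions and zeros elsewhere.
        codewords = []
        for positions in itertools.combinations(range(block_size), nonzeros):
            for signs in itertools.product((-1.0, 1.0), repeat=nonzeros):
                v = np.zeros(block_size)
                for p, s in zip(positions, signs):
                    v[p] = s
                codewords.append(v)
        return np.stack(codewords)  # shape (112, 8) for the defaults

    def encode_block(w, codebook):
        # Store one scale per block plus the index of the nearest codeword.
        scale = np.abs(w).max() + 1e-12
        index = int(np.argmin(((codebook - w / scale) ** 2).sum(axis=1)))
        return index, scale

    def decode_block(index, scale, codebook):
        # Decoding is a single table lookup.
        return scale * codebook[index]

    codebook = build_codebook()  # 112 codewords -> a 7-bit index per block
    w = np.array([0.0, 0.9, 0.0, 0.0, -1.1, 0.0, 0.0, 0.0])
    index, scale = encode_block(w, codebook)
    print(index, decode_block(index, scale, codebook))

Because every decoded weight is +1, −1, or 0, multiplying activations by a decoded block reduces to additions and subtractions (plus the per-block scale), which is the source of the multiplication-free property.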
Pages: 1009 - 1019 (10 pages)
Related Papers
50 entries in total (items [21] - [30] shown)
  • [21] Operator compression with deep neural networks
    Kröpfl, Fabian
    Maier, Roland
    Peterseim, Daniel
    ADVANCES IN CONTINUOUS AND DISCRETE MODELS, 2022, 2022 (01)
  • [23] Compression of Deep Neural Networks on the Fly
    Soulie, Guillaume
    Gripon, Vincent
    Robert, Maelys
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT II, 2016, 9887 : 153 - 160
  • [24] Supervised Deep Sparse Coding Networks for Image Classification
    Sun, Xiaoxia
    Nasrabadi, Nasser M.
    Tran, Trac D.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 405 - 418
  • [25] ADVERSARIAL ATTACKS ON DEEP UNFOLDED NETWORKS FOR SPARSE CODING
    Wang, Yulu
    Wu, Kailun
    Zhang, Changshui
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 5974 - 5978
  • [26] Sparse Deep Neural Networks for Embedded Intelligence
    Bi, Jia
    Gunn, Steve R.
    2018 IEEE 30TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2018, : 30 - 38
  • [27] Learning Sparse Patterns in Deep Neural Networks
    Wen, Weijing
    Yang, Fan
    Su, Yangfeng
    Zhou, Dian
    Zeng, Xuan
    2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2019
  • [28] Accelerating Sparse Deep Neural Networks on FPGAs
    Huang, Sitao
    Pearson, Carl
    Nagi, Rakesh
    Xiong, Jinjun
    Chen, Deming
    Hwu, Wen-mei
    2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2019
  • [29] Sparse synthesis regularization with deep neural networks
    Obmann, Daniel
    Schwab, Johannes
    Haltmeier, Markus
    2019 13TH INTERNATIONAL CONFERENCE ON SAMPLING THEORY AND APPLICATIONS (SAMPTA), 2019
  • [30] Group sparse regularization for deep neural networks
    Scardapane, Simone
    Comminiello, Danilo
    Hussain, Amir
    Uncini, Aurelio
    NEUROCOMPUTING, 2017, 241 : 81 - 89