Compression of Deep Neural Networks with Structured Sparse Ternary Coding

Authors
Yoonho Boo
Wonyong Sung
Affiliations
[1] Seoul National University, School of Electrical Engineering, Neural Processing Research Center
Keywords
Deep neural networks; Weight compression; Structured sparsity; Fixed-point quantization; Pruning
Abstract
Deep neural networks (DNNs) contain a large number of weights and usually require many off-chip memory accesses for inference. Weight compression is therefore a major requirement for on-chip-memory-based implementations of DNNs, which not only increase inference speed but also reduce power consumption. We propose a weight compression method for deep neural networks that combines pruning and quantization. The proposed method allows weights to take the values +1 or −1 only at predetermined positions. A look-up table then stores all admissible combinations of sub-vectors of the weight matrices, so the structured sparse weights can be encoded and decoded easily. This method not only allows multiplication-free DNN implementations but also compresses the weight storage by as much as 32× compared with floating-point networks, with only a small performance loss. Weight distribution normalization and gradual pruning techniques are applied to reduce the performance degradation. Experiments are conducted with fully connected DNNs and convolutional neural networks.
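The coding scheme in the abstract can be illustrated with a minimal sketch (not the authors' exact algorithm; function names, the magnitude threshold, and the sub-vector parameters are illustrative assumptions): weights are pruned and ternarized to {−1, 0, +1}, each sub-vector is constrained to a few nonzero entries, and every admissible ternary pattern is enumerated once in a look-up table so a sub-vector can be stored as a single index.

```python
import itertools
import numpy as np

def ternarize(w, threshold):
    """Prune and quantize: zero out small-magnitude weights,
    keep only the sign (+1/-1) of the rest."""
    t = np.zeros_like(w, dtype=np.int8)
    t[w > threshold] = 1
    t[w < -threshold] = -1
    return t

def build_codebook(subvec_len, max_nonzeros):
    """Enumerate every ternary sub-vector of length `subvec_len` with at
    most `max_nonzeros` nonzero entries; the structured-sparsity
    constraint is what keeps this table small."""
    return [p for p in itertools.product((-1, 0, 1), repeat=subvec_len)
            if sum(x != 0 for x in p) <= max_nonzeros]

def encode(ternary_weights, codebook):
    """Replace each sub-vector by its index in the codebook
    (assumes pruning already enforced the sparsity constraint)."""
    lut = {pattern: i for i, pattern in enumerate(codebook)}
    n = len(codebook[0])
    return [lut[tuple(ternary_weights[i:i + n])]
            for i in range(0, len(ternary_weights), n)]

def decode(indices, codebook):
    """Reconstruct the ternary weight vector from stored indices."""
    return np.concatenate([np.array(codebook[i], dtype=np.int8)
                           for i in indices])
```

With sub-vectors of length 4 and at most 2 nonzeros, the codebook has 1 + 8 + 24 = 33 patterns, so each index fits in 6 bits, i.e. about 1.5 bits per weight versus 32 bits for a float, and inference needs only additions and subtractions of the decoded ±1 weights.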
Pages: 1009-1019 (10 pages)
Related Papers (50 in total)
  • [31] Anwar, Sajid; Hwang, Kyuyeon; Sung, Wonyong. Structured Pruning of Deep Convolutional Neural Networks. ACM Journal on Emerging Technologies in Computing Systems, 2017, 13(3).
  • [32] Gabbay, Freddy; Salomon, Benjamin; Shomron, Gil. Structured Compression of Convolutional Neural Networks for Specialized Tasks. Mathematics, 2022, 10(19).
  • [33] Wen, Wei; Wu, Chunpeng; Wang, Yandan; Chen, Yiran; Li, Hai. Learning Structured Sparsity in Deep Neural Networks. Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016.
  • [34] Barlaud, Michel; Guyard, Frederic. Learning Sparse Deep Neural Networks Using Efficient Structured Projections on Convex Constraints for Green AI. 2020 25th International Conference on Pattern Recognition (ICPR), 2021: 1566-1573.
  • [35] Dominguez, David; Gonzalez, Mario; Rodriguez, Francisco B.; Serrano, Eduardo; Erichsen, R., Jr.; Theumann, W. K. Structured Information in Sparse-Code Metric Neural Networks. Physica A: Statistical Mechanics and Its Applications, 2012, 391(3): 799-808.
  • [36] Teng, Fei; Liu, Yiming; Li, Tianrui; Zhang, Yi; Li, Shuangqing; Zhao, Yue. A Review on Deep Neural Networks for ICD Coding. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(5): 4357-4375.
  • [37] Basu, Sourya; Varshney, Lav R. Universal Source Coding of Deep Neural Networks. 2017 Data Compression Conference (DCC), 2017: 310-319.
  • [38] Huang, Kunping; Raviv, Netanel; Jain, Siddharth; Upadhyaya, Pulakesh; Bruck, Jehoshua; Siegel, Paul H.; Jiang, Anxiao. Improve Robustness of Deep Neural Networks by Coding. 2020 Information Theory and Applications Workshop (ITA), 2020.
  • [39] Zhang, Lei; Zhou, Shengyuan; Zhi, Tian; Du, Zidong; Chen, Yunji. TDSNN: From Deep Neural Networks to Deep Spike Neural Networks with Temporal-Coding. Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019: 1319-1326.