Compression of Deep Neural Networks with Structured Sparse Ternary Coding

Authors
Yoonho Boo
Wonyong Sung
Affiliation
[1] Seoul National University, School of Electrical Engineering, Neural Processing Research Center
Keywords
Deep neural networks; Weight compression; Structured sparsity; Fixed-point quantization; Pruning
Abstract
Deep neural networks (DNNs) contain a large number of weights and usually require many off-chip memory accesses during inference. Compressing the weights is therefore a key requirement for on-chip-memory-based DNN implementations, which both increases inference speed and reduces power consumption. We propose a weight compression method for deep neural networks that combines pruning and quantization. The proposed method allows weights to take the values +1 or −1 only at predetermined positions, so that a look-up table can store all possible sub-vectors of the weight matrices. Structured sparse weights can then be encoded and decoded easily with this table. The method not only enables multiplication-free DNN implementations but also compresses the weight storage by as much as 32× relative to floating-point networks, with only a small performance loss. Weight-distribution normalization and gradual pruning are applied to reduce the performance degradation. Experiments are conducted on fully connected DNNs and convolutional neural networks.
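To make the coding scheme concrete, below is a minimal Python sketch of the look-up-table idea described in the abstract, assuming 8-weight sub-vectors with exactly 2 ternary (+1/−1) nonzeros each; the function names, the per-block scale, and the nearest-codeword assignment are illustrative assumptions, not the authors' published algorithm. With these parameters the codebook holds C(8,2) * 2^2 = 112 codewords, so each 8-weight block reduces to a 7-bit index plus one shared scale, on the order of the 32× compression over 32-bit floating point mentioned above.

    import itertools
    import numpy as np

    def build_codebook(block_size=8, nonzeros=2):
        # Enumerate every sub-vector with exactly `nonzeros` entries
        # drawn from {+1, -1} at chosen positions and zeros elsewhere.
        codewords = []
        for positions in itertools.combinations(range(block_size), nonzeros):
            for signs in itertools.product((-1.0, 1.0), repeat=nonzeros):
                v = np.zeros(block_size)
                for p, s in zip(positions, signs):
                    v[p] = s
                codewords.append(v)
        return np.stack(codewords)  # shape (112, 8) for the defaults

    def encode_block(w, codebook):
        # Store one scale per block plus the index of the nearest codeword.
        scale = np.abs(w).max() + 1e-12
        index = int(np.argmin(((codebook - w / scale) ** 2).sum(axis=1)))
        return index, scale

    def decode_block(index, scale, codebook):
        # Decoding is a single table lookup.
        return scale * codebook[index]

    codebook = build_codebook()  # 112 codewords -> a 7-bit index per block
    w = np.array([0.0, 0.9, 0.0, 0.0, -1.1, 0.0, 0.0, 0.0])
    index, scale = encode_block(w, codebook)
    print(index, decode_block(index, scale, codebook))

Because every decoded weight is +1, −1, or 0, multiplying activations by a decoded block reduces to additions and subtractions (plus the per-block scale), which is the source of the multiplication-free property.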
Pages: 1009 - 1019 (10 pages)
Related Papers
50 entries in total (items [21] - [30] shown)
  • [21] Operator compression with deep neural networks
    Kröpfl, Fabian
    Maier, Roland
    Peterseim, Daniel
    ADVANCES IN CONTINUOUS AND DISCRETE MODELS, 2022, 2022 (01)
  • [23] Compression of Deep Neural Networks on the Fly
    Soulie, Guillaume
    Gripon, Vincent
    Robert, Maelys
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2016, PT II, 2016, 9887 : 153 - 160
  • [24] Supervised Deep Sparse Coding Networks for Image Classification
    Sun, Xiaoxia
    Nasrabadi, Nasser M.
    Tran, Trac D.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 405 - 418
  • [25] ADVERSARIAL ATTACKS ON DEEP UNFOLDED NETWORKS FOR SPARSE CODING
    Wang, Yulu
    Wu, Kailun
    Zhang, Changshui
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 5974 - 5978
  • [26] Sparse Deep Neural Networks for Embedded Intelligence
    Bi, Jia
    Gunn, Steve R.
    2018 IEEE 30TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2018, : 30 - 38
  • [27] Learning Sparse Patterns in Deep Neural Networks
    Wen, Weijing
    Yang, Fan
    Su, Yangfeng
    Zhou, Dian
    Zeng, Xuan
    2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2019
  • [28] Accelerating Sparse Deep Neural Networks on FPGAs
    Huang, Sitao
    Pearson, Carl
    Nagi, Rakesh
    Xiong, Jinjun
    Chen, Deming
    Hwu, Wen-mei
    2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2019
  • [29] Sparse synthesis regularization with deep neural networks
    Obmann, Daniel
    Schwab, Johannes
    Haltmeier, Markus
    2019 13TH INTERNATIONAL CONFERENCE ON SAMPLING THEORY AND APPLICATIONS (SAMPTA), 2019
  • [30] Group sparse regularization for deep neural networks
    Scardapane, Simone
    Comminiello, Danilo
    Hussain, Amir
    Uncini, Aurelio
    NEUROCOMPUTING, 2017, 241 : 81 - 89