Compression of Deep Neural Networks with Structured Sparse Ternary Coding

Citations: 0
Authors
Yoonho Boo
Wonyong Sung
Affiliations
[1] Seoul National University,School of Electrical Engineering, Neural Processing Research Center
Source
Journal of Signal Processing Systems for Signal, Image, and Video Technology, 2019, 91(09)
Keywords
Deep neural networks; Weight compression; Structured sparsity; Fixed-point quantization; Pruning
DOI
Not available
Abstract
Deep neural networks (DNNs) contain a large number of weights and usually require many off-chip memory accesses for inference. Weight compression is therefore a key requirement for on-chip-memory-based implementations of DNNs, which not only increases inference speed but also reduces power consumption. We propose a weight compression method for deep neural networks that combines pruning and quantization. The proposed method allows weights to take the values +1 or −1 only at predetermined positions. A look-up table then stores all possible combinations of sub-vectors of the weight matrices, so that encoding and decoding the structured sparse weights can be performed easily with the table. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by as much as 32× relative to floating-point networks, with only a small performance loss. Weight distribution normalization and gradual pruning techniques are applied to reduce performance degradation. Experiments are conducted with fully connected DNNs and convolutional neural networks.
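The encoding scheme sketched in the abstract can be illustrated with a small example. The sketch below is not the authors' implementation; the function names and the sub-vector length/sparsity parameters (here n=4 positions with at most k=1 nonzero entry) are illustrative assumptions. It enumerates every allowed ternary sub-vector in a look-up table, so each sub-vector compresses to a single table index:

```python
from itertools import combinations, product

def build_lut(n, k):
    """Enumerate all ternary sub-vectors of length n with at most k
    nonzero entries, each restricted to +1 or -1 (illustrative scheme).
    The position of a vector in this table serves as its code word."""
    lut = [tuple([0] * n)]  # the all-zero (fully pruned) sub-vector
    for nz in range(1, k + 1):
        for pos in combinations(range(n), nz):      # nonzero positions
            for signs in product((1, -1), repeat=nz):  # +1/-1 choices
                v = [0] * n
                for p, s in zip(pos, signs):
                    v[p] = s
                lut.append(tuple(v))
    return lut

def encode(subvec, lut):
    # Store only the table index instead of n floating-point weights.
    return lut.index(tuple(subvec))

def decode(code, lut):
    return list(lut[code])

# With n=4, k=1 the table holds 1 + 4*2 = 9 entries, so each
# 4-weight sub-vector fits in a 4-bit code instead of 4*32 bits.
lut = build_lut(n=4, k=1)
code = encode([0, -1, 0, 0], lut)
assert decode(code, lut) == [0, -1, 0, 0]
```

Because every stored weight is 0, +1, or −1, the dot products in inference reduce to additions and subtractions, which is what makes the multiplication-free implementation mentioned above possible.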
Pages: 1009–1019
Page count: 10
Related Papers
50 records in total
  • [1] Compression of Deep Neural Networks with Structured Sparse Ternary Coding
    Boo, Yoonho
    Sung, Wonyong
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2019, 91 (09): : 1009 - 1019
  • [2] Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations
    Boo, Yoonho
    Sung, Wonyong
    2017 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2017,
  • [3] Layerwise Sparse Coding for Pruned Deep Neural Networks with Extreme Compression Ratio
    Liu, Xiao
    Li, Wenbin
    Huo, Jing
    Yao, Lili
    Gao, Yang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 4900 - 4907
  • [4] Deep Neural Network Structured Sparse Coding for Online Processing
    Zhao, Haoli
    Ding, Shuxue
    Li, Xiang
    Huang, Huakun
    IEEE ACCESS, 2018, 6 : 74778 - 74791
  • [5] Hierarchical Sparse Coding of Objects in Deep Convolutional Neural Networks
    Liu, Xingyu
    Zhen, Zonglei
    Liu, Jia
    FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2020, 14
  • [6] RadiX-Net: Structured Sparse Matrices for Deep Neural Networks
    Robinett, Ryan A.
    Kepner, Jeremy
    2019 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2019, : 268 - 274
  • [7] Video Anomaly Detection with Sparse Coding Inspired Deep Neural Networks
    Luo, Weixin
    Liu, Wen
    Lian, Dongze
    Tang, Jinhui
    Duan, Lixin
    Peng, Xi
    Gao, Shenghua
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (03) : 1070 - 1084
  • [8] Structured Compression of Deep Neural Networks with Debiased Elastic Group LASSO
    Oyedotun, Oyebade K.
    Aouada, Djamila
    Ottersten, Bjoern
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 2266 - 2275
  • [9] SUPERVISED DEEP SPARSE CODING NETWORKS
    Sun, Xiaoxia
    Nasrabadi, Nasser M.
    Tran, Trac D.
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 346 - 350
  • [10] BAYESIAN NEURAL NETWORKS FOR SPARSE CODING
    Kuzin, Danil
    Isupova, Olga
    Mihaylova, Lyudmila
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 2992 - 2996