Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks

Authors
Chunshan Li
Qing Du
Xiaofei Xu
Jinhui Zhu
Dianhui Chu
Affiliations
[1] Harbin Institute of Technology, Department of Computer Science and Technology
[2] South China University of Technology, School of Software Engineering
Abstract
Deep neural networks have achieved state-of-the-art performance in a wide range of scenarios, such as natural language processing, object detection, image classification, and speech recognition. Despite these impressive results, neural network models remain computationally expensive and memory-intensive to train and store, which limits their deployment in mobile service scenarios. Simplifying and accelerating neural networks is therefore a crucial research topic. To address this issue, we propose "Bit-Quantized-Net" (BQ-Net), which compresses deep neural networks during both training and inference, and further reduces model size by compressing the bit-quantized weights. Specifically, training or testing a plain neural network requires tens of millions of evaluations of y = wx + b. During forward propagation, BQ-Net instead approximates y = wx + b by y = sign(w)(x ≫ |w|) + b, replacing each multiplication with a bit shift. That is, BQ-Net trains the network with bit-quantized weights during forward propagation, while retaining the full-precision weights for gradient accumulation during backward propagation. Finally, we apply Huffman coding to the bit-shift weights, which compresses the model size further. Extensive experiments on three real data sets (MNIST, CIFAR-10, SVHN) show that BQ-Net achieves 10-14× model compression.
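The key mechanism is that a weight constrained to a signed power of two turns every multiply in y = wx + b into a sign flip plus a bit shift. Below is a minimal NumPy sketch of that idea, assuming weights with magnitude below 1 are rounded to the nearest power of two; the function names, the shift cap, and the scalar squared-error update are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def quantize_to_shift(w, max_shift=7):
    """Map each weight to (sign, s) with w ~ sign * 2**(-s), so that
    w * x can be computed as sign * (x >> s). Assumes |w| < 1; the
    shift cap max_shift is an illustrative choice."""
    sign = np.sign(w)
    s = np.clip(np.round(-np.log2(np.abs(w) + 1e-12)), 0, max_shift)
    return sign, s.astype(int)

def forward(x, w, b):
    """Forward pass with bit-quantized weights:
    y = sign(w) * (x >> s) + b, written with a float analogue of the
    right shift so it also works for real-valued activations."""
    sign, s = quantize_to_shift(w)
    return sign * (x * 2.0 ** (-s)) + b

def train_step(x, y_true, w_fp, b, lr=0.01):
    """One toy SGD step: the forward pass uses quantized weights, but
    the gradient is accumulated into the retained full-precision
    weights w_fp (a straight-through-style update on a scalar
    squared-error loss)."""
    y = forward(x, w_fp, b)
    grad_y = 2.0 * (y - y_true)    # d(loss)/dy for loss = (y - y_true)**2
    w_fp = w_fp - lr * grad_y * x  # gradient flows to full-precision weights
    b = b - lr * grad_y
    return w_fp, b
```

Because each stored weight reduces to a sign bit plus a small integer shift drawn from only a few distinct values, the weight stream is highly repetitive, which is what makes the final Huffman-coding stage effective.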
Pages: 104-113 (9 pages)
Related Papers (50 total)
  • [1] Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks
    Li, Chunshan
    Du, Qing
    Xu, Xiaofei
    Zhu, Jinhui
    Chu, Dianhui
    MOBILE NETWORKS & APPLICATIONS, 2021, 26 (01): 104-113
  • [2] A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
    Long, Xin
    Zeng, XiangRong
    Ben, Zongcheng
    Zhou, Dianle
    Zhang, Maojun
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2020, 2020
  • [3] Compressing Deep Neural Networks for Recognizing Places
    Saha, Soham
    Varma, Girish
    Jawahar, C. V.
    PROCEEDINGS 2017 4TH IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR), 2017: 352-357
  • [4] Anonymous Model Pruning for Compressing Deep Neural Networks
    Zhang, Lechun
    Chen, Guangyao
    Shi, Yemin
    Zhang, Quan
    Tan, Mingkui
    Wang, Yaowei
    Tian, Yonghong
    Huang, Tiejun
    THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2020), 2020: 161-164
  • [5] COMPRESSING DEEP NEURAL NETWORKS FOR EFFICIENT SPEECH ENHANCEMENT
    Tan, Ke
    Wang, DeLiang
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 8358-8362
  • [6] CUP: Cluster Pruning for Compressing Deep Neural Networks
    Duggal, Rahul
    Xiao, Cao
    Vuduc, Richard
    Chau, Duen Horng
    Sun, Jimeng
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021: 5102-5106
  • [7] Compressing Deep Neural Networks With Sparse Matrix Factorization
    Wu, Kailun
    Guo, Yiwen
    Zhang, Changshui
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (10): 3828-3838
  • [8] COMPRESSING DEEP NEURAL NETWORKS FOR EFFICIENT VISUAL INFERENCE
    Ge, Shiming
    Luo, Zhao
    Zhao, Shengwei
    Jin, Xin
    Zhang, Xiao-Yu
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017: 667-672
  • [9] Compressing deep neural networks by matrix product operators
    Gao, Ze-Feng
    Cheng, Song
    He, Rong-Qiang
    Xie, Z. Y.
    Zhao, Hui-Hai
    Lu, Zhong-Yi
    Xiang, Tao
    PHYSICAL REVIEW RESEARCH, 2020, 2 (02)
  • [10] Height quantized diffractive deep neural networks
    Li, Runze
    Zhuang, Xuhui
    Ding, Gege
    Song, Mingzhu
    Jin, Guang
    Zhang, Xuemin
    Wen, Jie
    Wang, Shaoju
    PHYSICA SCRIPTA, 2025, 100 (03)