Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks

Cited by: 0
Authors
Chunshan Li
Qing Du
Xiaofei Xu
Jinhui Zhu
Dianhui Chu
Affiliations
[1] Harbin Institute of Technology,Department of Computer Science and Technology
[2] South China University of Technology,School of Software Engineering
Abstract
Deep neural networks have achieved state-of-the-art performance in a wide range of scenarios, such as natural language processing, object detection, image classification, and speech recognition. Despite these impressive results across machine learning tasks, neural network models remain computationally expensive and memory-intensive to train and store, which limits their deployment in mobile service scenarios. Consequently, how to simplify and accelerate neural networks has become a crucial research topic. To address this issue, we propose "Bit-Quantized-Net" (BQ-Net), which compresses deep neural networks during both training and inference, and further reduces model size by compressing the bit-quantized weights. Specifically, training or testing a plain neural network model requires tens of millions of y = wx + b computations. In BQ-Net, the operation y = wx + b is approximated by y = sign(w)(x ≫ |w|) + b during forward propagation. That is, BQ-Net trains the network with bit-quantized weights in the forward pass, while retaining the full-precision weights for gradient accumulation in the backward pass. Finally, we apply Huffman coding to encode the bit-shift weights, which further compresses the model. Extensive experiments on three real datasets (MNIST, CIFAR-10, SVHN) show that BQ-Net achieves 10-14× model compression.
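The core idea in the abstract, replacing the multiplication in y = wx + b with a sign-and-shift y = sign(w)(x ≫ |w|) + b, can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the quantization rule (rounding each weight magnitude to the nearest power of two, with the shift amount clipped to an assumed range of 0-7) and the function names are assumptions, and the bit shift is simulated in floating point as multiplication by 2^(-k).

```python
import numpy as np

def bit_quantize(w):
    """Approximate each weight by a signed power of two: w ~ sign(w) * 2^(-k).
    The rounding rule and the 0-7 clipping range are assumptions for this sketch."""
    sign = np.sign(w)
    mag = np.maximum(np.abs(w), 1e-8)          # avoid log2(0) for zero weights
    k = np.clip(np.round(-np.log2(mag)), 0, 7).astype(int)
    return sign, k

def bq_forward(x, w, b):
    """Simulated BQ-Net forward pass: y = sign(w) * (x >> k) + b, where the
    right shift by k is emulated as multiplication by 2^(-k)."""
    sign, k = bit_quantize(w)
    return sign * (x * np.power(2.0, -k)) + b
```

In a full training loop, this quantized forward pass would be paired with a backward pass that accumulates gradients into the retained full-precision weights, as the abstract describes; only the shift amounts k (and signs) would then need to be stored, which is what makes the subsequent Huffman coding effective.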
Pages: 104 - 113 (9 pages)