Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks

Cited by: 0
Authors
Chunshan Li
Qing Du
Xiaofei Xu
Jinhui Zhu
Dianhui Chu
Affiliations
[1] Harbin Institute of Technology, Department of Computer Science and Technology
[2] South China University of Technology, School of Software Engineering
DOI: Not available
Abstract
Deep neural networks have achieved state-of-the-art performance in a wide range of scenarios, such as natural language processing, object detection, image classification, and speech recognition. While showing impressive results across these machine learning tasks, neural network models remain computationally expensive and memory intensive to train and store, which is prohibitive in mobile service scenarios. As a result, how to simplify and accelerate neural networks is a crucial research topic. To address this issue, we propose "Bit-Quantized-Net" (BQ-Net), which compresses deep neural networks during both training and inference; the model size is further reduced by encoding the bit-quantized weights. Specifically, training or testing a plain neural network involves tens of millions of y = wx + b computations. BQ-Net approximates each such operation by y = sign(w)(x ≫ |w|) + b during forward propagation, i.e., the multiplication is replaced by a right shift whose amount encodes the quantized weight magnitude. That is, BQ-Net trains the network with bit-quantized weights during forward propagation while retaining the full-precision weights for gradient accumulation during backward propagation. Finally, we apply Huffman coding to the bit-shift weights, which further compresses the model size. Extensive experiments on three real datasets (MNIST, CIFAR-10, SVHN) show that BQ-Net achieves 10-14× model compression.
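The forward-pass rule y = sign(w)(x ≫ |w|) + b replaces each weight multiplication with a bit shift. Below is a minimal NumPy sketch of that idea, assuming weights are rounded to the nearest signed power of two so the shift amount plays the role of |w|; the abstract does not spell out BQ-Net's exact quantization rule, and the names quantize_to_shift and forward_shift_layer are illustrative, not from the paper.

```python
import numpy as np

def quantize_to_shift(w, max_shift=7):
    """Map full-precision weights to a sign and an integer shift amount.

    Each weight is approximated by sign(w) * 2**(-shift), i.e. the nearest
    power of two; the small integer shifts are what would later be
    Huffman-coded. (Sketch only: assumed quantization rule.)
    """
    sign = np.sign(w)
    mag = np.maximum(np.abs(w), 2.0 ** (-max_shift))  # avoid log2(0)
    shift = np.clip(np.round(-np.log2(mag)), 0, max_shift).astype(int)
    return sign, shift

def forward_shift_layer(x, sign, shift, b):
    """Forward pass y = sign(w)(x >> |w|) + b, written with the
    floating-point equivalent of a right shift: x * 2**(-shift)."""
    w_q = sign * (2.0 ** (-shift))  # effective quantized weights
    return x @ w_q + b

# Toy usage: the full-precision master weights are the ones gradient
# updates would be applied to during backpropagation (as the abstract
# describes); only the quantized copy is used in the forward pass.
rng = np.random.default_rng(0)
w_full = rng.normal(scale=0.1, size=(4, 3))  # full-precision master weights
b = np.zeros(3)
x = rng.normal(size=(2, 4))

sign, shift = quantize_to_shift(w_full)
y = forward_shift_layer(x, sign, shift, b)
print(y.shape)  # (2, 3)
```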
Pages: 104-113
Number of pages: 9