Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks

Authors
Chunshan Li
Qing Du
Xiaofei Xu
Jinhui Zhu
Dianhui Chu
Affiliations
[1] Harbin Institute of Technology, Department of Computer Science and Technology
[2] South China University of Technology, School of Software Engineering
Abstract
Deep neural networks have achieved state-of-the-art performance in a wide range of scenarios, such as natural language processing, object detection, image classification, and speech recognition. Despite these impressive results, neural network models remain computationally expensive and memory-intensive to train and store, which limits their deployment in mobile service scenarios. Simplifying and accelerating neural networks is therefore a crucial research topic. To address this issue, we propose "Bit-Quantized-Net" (BQ-Net), which compresses deep neural networks during both training and inference, and further reduces model size by compressing the bit-quantized weights. Specifically, training or testing a plain neural network requires tens of millions of evaluations of y = wx + b. During forward propagation, BQ-Net instead approximates y = wx + b by y = sign(w)(x ≫ |w|) + b, replacing each multiplication with a bit shift. That is, BQ-Net trains the network with bit-quantized weights during forward propagation, while retaining the full-precision weights for gradient accumulation during backward propagation. Finally, we apply Huffman coding to the bit-shift weights, which compresses the model size further. Extensive experiments on three real data sets (MNIST, CIFAR-10, SVHN) show that BQ-Net achieves 10-14× model compression.
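The key mechanism is that a weight constrained to a signed power of two turns every multiply in y = wx + b into a sign flip plus a bit shift. Below is a minimal NumPy sketch of that idea, assuming weights with magnitude below 1 are rounded to the nearest power of two; the function names, the shift cap, and the scalar squared-error update are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def quantize_to_shift(w, max_shift=7):
    """Map each weight to (sign, s) with w ~ sign * 2**(-s), so that
    w * x can be computed as sign * (x >> s). Assumes |w| < 1; the
    shift cap max_shift is an illustrative choice."""
    sign = np.sign(w)
    s = np.clip(np.round(-np.log2(np.abs(w) + 1e-12)), 0, max_shift)
    return sign, s.astype(int)

def forward(x, w, b):
    """Forward pass with bit-quantized weights:
    y = sign(w) * (x >> s) + b, written with a float analogue of the
    right shift so it also works for real-valued activations."""
    sign, s = quantize_to_shift(w)
    return sign * (x * 2.0 ** (-s)) + b

def train_step(x, y_true, w_fp, b, lr=0.01):
    """One toy SGD step: the forward pass uses quantized weights, but
    the gradient is accumulated into the retained full-precision
    weights w_fp (a straight-through-style update on a scalar
    squared-error loss)."""
    y = forward(x, w_fp, b)
    grad_y = 2.0 * (y - y_true)    # d(loss)/dy for loss = (y - y_true)**2
    w_fp = w_fp - lr * grad_y * x  # gradient flows to full-precision weights
    b = b - lr * grad_y
    return w_fp, b
```

Because each stored weight reduces to a sign bit plus a small integer shift drawn from only a few distinct values, the weight stream is highly repetitive, which is what makes the final Huffman-coding stage effective.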
Pages: 104-113 (9 pages)
Related Papers (50 total)
  • [1] Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks
    Li, Chunshan
    Du, Qing
    Xu, Xiaofei
    Zhu, Jinhui
    Chu, Dianhui
    MOBILE NETWORKS & APPLICATIONS, 2021, 26 (01): 104-113
  • [2] A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks
    Long, Xin
    Zeng, XiangRong
    Ben, Zongcheng
    Zhou, Dianle
    Zhang, Maojun
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2020, 2020
  • [3] Compressing Deep Neural Networks for Recognizing Places
    Saha, Soham
    Varma, Girish
    Jawahar, C. V.
    PROCEEDINGS 2017 4TH IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR), 2017: 352-357
  • [4] Anonymous Model Pruning for Compressing Deep Neural Networks
    Zhang, Lechun
    Chen, Guangyao
    Shi, Yemin
    Zhang, Quan
    Tan, Mingkui
    Wang, Yaowei
    Tian, Yonghong
    Huang, Tiejun
    THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2020), 2020: 161-164
  • [5] COMPRESSING DEEP NEURAL NETWORKS FOR EFFICIENT SPEECH ENHANCEMENT
    Tan, Ke
    Wang, DeLiang
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 8358-8362
  • [6] CUP: Cluster Pruning for Compressing Deep Neural Networks
    Duggal, Rahul
    Xiao, Cao
    Vuduc, Richard
    Chau, Duen Horng
    Sun, Jimeng
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021: 5102-5106
  • [7] Compressing Deep Neural Networks With Sparse Matrix Factorization
    Wu, Kailun
    Guo, Yiwen
    Zhang, Changshui
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (10): 3828-3838
  • [8] COMPRESSING DEEP NEURAL NETWORKS FOR EFFICIENT VISUAL INFERENCE
    Ge, Shiming
    Luo, Zhao
    Zhao, Shengwei
    Jin, Xin
    Zhang, Xiao-Yu
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017: 667-672
  • [9] Compressing deep neural networks by matrix product operators
    Gao, Ze-Feng
    Cheng, Song
    He, Rong-Qiang
    Xie, Z. Y.
    Zhao, Hui-Hai
    Lu, Zhong-Yi
    Xiang, Tao
    PHYSICAL REVIEW RESEARCH, 2020, 2 (02)
  • [10] Height quantized diffractive deep neural networks
    Li, Runze
    Zhuang, Xuhui
    Ding, Gege
    Song, Mingzhu
    Jin, Guang
    Zhang, Xuemin
    Wen, Jie
    Wang, Shaoju
    PHYSICA SCRIPTA, 2025, 100 (03)