Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks

Cited by: 0
Authors
Chunshan Li
Qing Du
Xiaofei Xu
Jinhui Zhu
Dianhui Chu
Affiliations
[1] Harbin Institute of Technology, Department of Computer Science and Technology
[2] South China University of Technology, School of Software Engineering
DOI: Not available
Abstract
Deep neural networks have achieved state-of-the-art performance in a wide range of scenarios, such as natural language processing, object detection, image classification, and speech recognition. While showing impressive results across these machine learning tasks, neural network models remain computationally expensive and memory intensive to train and store, which is prohibitive in mobile service scenarios. As a result, how to simplify and accelerate neural networks is a crucial research topic. To address this issue, we propose "Bit-Quantized-Net" (BQ-Net), which compresses deep neural networks during both training and inference; the model size is further reduced by encoding the bit-quantized weights. Specifically, training or testing a plain neural network involves tens of millions of y = wx + b computations. BQ-Net approximates each such operation by y = sign(w)(x ≫ |w|) + b during forward propagation, i.e., the multiplication is replaced by a right shift whose amount encodes the quantized weight magnitude. That is, BQ-Net trains the network with bit-quantized weights during forward propagation while retaining the full-precision weights for gradient accumulation during backward propagation. Finally, we apply Huffman coding to the bit-shift weights, which further compresses the model size. Extensive experiments on three real datasets (MNIST, CIFAR-10, SVHN) show that BQ-Net achieves 10-14× model compression.
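The forward-pass rule y = sign(w)(x ≫ |w|) + b replaces each weight multiplication with a bit shift. Below is a minimal NumPy sketch of that idea, assuming weights are rounded to the nearest signed power of two so the shift amount plays the role of |w|; the abstract does not spell out BQ-Net's exact quantization rule, and the names quantize_to_shift and forward_shift_layer are illustrative, not from the paper.

```python
import numpy as np

def quantize_to_shift(w, max_shift=7):
    """Map full-precision weights to a sign and an integer shift amount.

    Each weight is approximated by sign(w) * 2**(-shift), i.e. the nearest
    power of two; the small integer shifts are what would later be
    Huffman-coded. (Sketch only: assumed quantization rule.)
    """
    sign = np.sign(w)
    mag = np.maximum(np.abs(w), 2.0 ** (-max_shift))  # avoid log2(0)
    shift = np.clip(np.round(-np.log2(mag)), 0, max_shift).astype(int)
    return sign, shift

def forward_shift_layer(x, sign, shift, b):
    """Forward pass y = sign(w)(x >> |w|) + b, written with the
    floating-point equivalent of a right shift: x * 2**(-shift)."""
    w_q = sign * (2.0 ** (-shift))  # effective quantized weights
    return x @ w_q + b

# Toy usage: the full-precision master weights are the ones gradient
# updates would be applied to during backpropagation (as the abstract
# describes); only the quantized copy is used in the forward pass.
rng = np.random.default_rng(0)
w_full = rng.normal(scale=0.1, size=(4, 3))  # full-precision master weights
b = np.zeros(3)
x = rng.normal(size=(2, 4))

sign, shift = quantize_to_shift(w_full)
y = forward_shift_layer(x, sign, shift, b)
print(y.shape)  # (2, 3)
```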
Pages: 104-113
Number of pages: 9