Low-bit Quantization Needs Good Distribution

Cited by: 3
Authors
Yu, Haibao [1 ]
Wen, Tuopu [2 ]
Cheng, Guangliang [1 ]
Sun, Jiankai [3 ]
Han, Qi [1 ]
Shi, Jianping [1 ]
Affiliations
[1] SenseTime Research, Beijing, China
[2] Tsinghua University, Beijing, China
[3] The Chinese University of Hong Kong, Hong Kong, China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020
DOI
10.1109/CVPRW50498.2020.00348
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Low-bit quantization (e.g., 4-bit for both weights and activations) makes it challenging to maintain high performance with limited model capacity. The distributions of weights and activations in deep neural networks are naturally Gaussian-like. Nevertheless, given the limited bitwidth of a low-bit model, uniform-like distributions of weights and activations have proved to be more quantization-friendly while preserving accuracy. Motivated by this, we propose Scale-Clip, a distribution-reshaping technique that dynamically reshapes weights or activations into a uniform-like distribution. Furthermore, to increase the capacity of a low-bit model, we propose a novel group-based quantization algorithm that splits the filters into several groups. Different groups can learn different quantization parameters, which can be elegantly merged into the batch normalization layer without extra computational cost at inference. Finally, we integrate the Scale-Clip technique with the group-based quantization algorithm into the Group-based Distribution Reshaping Quantization (GDRQ) framework to further improve quantization performance. Experiments on various networks (e.g., VGGNet and ResNet) and vision tasks (e.g., classification, detection, and segmentation) demonstrate that our framework achieves much better performance than state-of-the-art quantization methods. Specifically, a ResNet-50 model with 2-bit weights and 4-bit activations obtained by our framework suffers less than 1% accuracy drop on the ImageNet classification task, which is, to the best of our knowledge, a new state of the art.
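The abstract names two components: a clipping step that reshapes Gaussian-like tensors toward a uniform-like distribution, and a per-group uniform quantizer whose scales can be folded into batch normalization at inference. The exact Scale-Clip and GDRQ formulations are given in the paper; the sketch below is only a minimal illustration of the general idea, and the threshold rule `k * mean(|w|)`, the function names, and the group layout are assumptions, not the authors' definitions.

```python
import numpy as np

def scale_clip(w, k=2.0):
    """Illustrative distribution reshaping (NOT the paper's exact rule):
    clip to a data-dependent threshold so the tails of a Gaussian-like
    tensor are folded in, pushing mass toward a uniform-like shape.
    The threshold tracks the tensor's statistics, so it adapts as the
    weights change during training ("dynamic")."""
    t = k * np.mean(np.abs(w))            # assumed threshold rule
    return np.clip(w, -t, t)

def group_uniform_quantize(w, groups=4, bits=2):
    """Illustrative group-based uniform quantizer: the output filters
    (axis 0) are split into `groups` groups, each with its own scale.
    A per-filter scale is a per-channel constant, so at inference it
    can be folded into the following batch norm's gamma/beta rather
    than costing an extra multiply."""
    assert w.shape[0] % groups == 0, "filters must split evenly"
    levels = 2 ** (bits - 1) - 1          # e.g. 1 for 2-bit signed
    out = np.empty_like(w)
    size = w.shape[0] // groups
    for g in range(groups):
        sl = slice(g * size, (g + 1) * size)
        scale = np.max(np.abs(w[sl])) / levels   # per-group step size
        out[sl] = np.round(w[sl] / scale) * scale
    return out

# Gaussian-like weights for a toy conv layer: 16 filters of 8x3x3,
# quantized to 2-bit after clipping, mirroring the W2A4 setting.
w = np.random.randn(16, 8, 3, 3)
w_q = group_uniform_quantize(scale_clip(w), groups=4, bits=2)
```

With `bits=2` each group collapses to the levels {-scale, 0, +scale}; the per-group `scale` is the kind of quantization parameter the abstract says can be absorbed into the following batch normalization layer.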
Pages: 2909-2918
Number of pages: 10