Low-bit Quantization Needs Good Distribution

Cited by: 3
Authors
Yu, Haibao [1 ]
Wen, Tuopu [2 ]
Cheng, Guangliang [1 ]
Sun, Jiankai [3 ]
Han, Qi [1 ]
Shi, Jianping [1 ]
Affiliations
[1] SenseTime Research, Beijing, China
[2] Tsinghua University, Beijing, China
[3] The Chinese University of Hong Kong, Hong Kong, China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020
DOI
10.1109/CVPRW50498.2020.00348
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Low-bit quantization (e.g., 4-bit for both weights and activations) makes it challenging to maintain high performance with limited model capacity. The distributions of weights and activations in deep neural networks are naturally Gaussian-like. Nevertheless, given the limited bitwidth of a low-bit model, uniform-like distributions of weights and activations have proved to be more quantization-friendly while preserving accuracy. Motivated by this, we propose Scale-Clip, a distribution-reshaping technique that dynamically reshapes weights or activations into a uniform-like distribution. Furthermore, to increase the capacity of a low-bit model, we propose a novel group-based quantization algorithm that splits the filters into several groups. Different groups can learn different quantization parameters, which can be elegantly merged into the batch normalization layer without extra computational cost at inference. Finally, we integrate the Scale-Clip technique with the group-based quantization algorithm into the Group-based Distribution Reshaping Quantization (GDRQ) framework to further improve quantization performance. Experiments on various networks (e.g., VGGNet and ResNet) and vision tasks (e.g., classification, detection, and segmentation) demonstrate that our framework achieves much better performance than state-of-the-art quantization methods. Specifically, a ResNet-50 model with 2-bit weights and 4-bit activations obtained by our framework suffers less than 1% accuracy drop on the ImageNet classification task, which is, to the best of our knowledge, a new state of the art.
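The abstract names two components: a clipping step that reshapes Gaussian-like tensors toward a uniform-like distribution, and a per-group uniform quantizer whose scales can be folded into batch normalization at inference. The exact Scale-Clip and GDRQ formulations are given in the paper; the sketch below is only a minimal illustration of the general idea, and the threshold rule `k * mean(|w|)`, the function names, and the group layout are assumptions, not the authors' definitions.

```python
import numpy as np

def scale_clip(w, k=2.0):
    """Illustrative distribution reshaping (NOT the paper's exact rule):
    clip to a data-dependent threshold so the tails of a Gaussian-like
    tensor are folded in, pushing mass toward a uniform-like shape.
    The threshold tracks the tensor's statistics, so it adapts as the
    weights change during training ("dynamic")."""
    t = k * np.mean(np.abs(w))            # assumed threshold rule
    return np.clip(w, -t, t)

def group_uniform_quantize(w, groups=4, bits=2):
    """Illustrative group-based uniform quantizer: the output filters
    (axis 0) are split into `groups` groups, each with its own scale.
    A per-filter scale is a per-channel constant, so at inference it
    can be folded into the following batch norm's gamma/beta rather
    than costing an extra multiply."""
    assert w.shape[0] % groups == 0, "filters must split evenly"
    levels = 2 ** (bits - 1) - 1          # e.g. 1 for 2-bit signed
    out = np.empty_like(w)
    size = w.shape[0] // groups
    for g in range(groups):
        sl = slice(g * size, (g + 1) * size)
        scale = np.max(np.abs(w[sl])) / levels   # per-group step size
        out[sl] = np.round(w[sl] / scale) * scale
    return out

# Gaussian-like weights for a toy conv layer: 16 filters of 8x3x3,
# quantized to 2-bit after clipping, mirroring the W2A4 setting.
w = np.random.randn(16, 8, 3, 3)
w_q = group_uniform_quantize(scale_clip(w), groups=4, bits=2)
```

With `bits=2` each group collapses to the levels {-scale, 0, +scale}; the per-group `scale` is the kind of quantization parameter the abstract says can be absorbed into the following batch normalization layer.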
Pages: 2909-2918
Number of pages: 10