AdaQAT: Adaptive Bit-Width Quantization-Aware Training

Cited by: 0
Authors
Gernigon, Cedric [1 ]
Filip, Silviu-Ioan [1 ]
Sentieys, Olivier [1 ]
Coggiola, Clement [2 ]
Bruno, Mickael [2 ]
Affiliations
[1] Univ Rennes, INRIA, CNRS, IRISA, F-35000 Rennes, France
[2] CNES, Spacecraft Tech, Onboard Data Handling, Toulouse, France
Keywords
Neural Network Compression; Quantization Aware Training; Adaptive Bit-Width Optimization;
DOI
10.1109/AICAS59952.2024.10595895
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Large-scale deep neural networks (DNNs) have achieved remarkable success in many application scenarios. However, the high computational complexity and energy cost of modern DNNs make their deployment on edge devices challenging. Model quantization is a common approach to dealing with deployment constraints, but searching for optimized bit-widths can be difficult. In this work, we present Adaptive Bit-Width Quantization-Aware Training (AdaQAT), a learning-based method that automatically optimizes weight and activation bit-widths during training for more efficient DNN inference. We use relaxed real-valued bit-widths that are updated using a gradient descent rule, but are otherwise discretized for all quantization operations. The result is a simple and flexible QAT approach for mixed-precision uniform quantization problems. Compared to other methods, which are generally designed to be run on a pretrained network, AdaQAT works well in both training-from-scratch and fine-tuning scenarios. Initial results on the CIFAR-10 and ImageNet datasets using ResNet20 and ResNet18 models, respectively, indicate that our method is competitive with other state-of-the-art mixed-precision quantization approaches.
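The abstract's core idea, relaxed real-valued bit-widths that gradient descent updates but that are discretized whenever quantization is performed, can be illustrated with a short sketch. The following is a hypothetical PyTorch illustration, not the authors' released implementation: it assumes symmetric uniform quantization, a straight-through estimator for both value rounding and bit-width rounding, and a simple penalty term on the bit-width; the names `ste_round` and `LearnableBitWidthQuantizer` are invented for this example.

```python
# Hypothetical sketch of learnable bit-width quantization-aware training.
# Not the AdaQAT reference code: a relaxed, real-valued bit-width is kept as a
# trainable parameter, rounded to an integer whenever it is used to quantize,
# and the rounding is bypassed in the backward pass (straight-through
# estimator) so that ordinary gradient descent can update it.
import torch
import torch.nn as nn


def ste_round(x):
    """Round in the forward pass; identity gradient in the backward pass."""
    return (x.round() - x).detach() + x


class LearnableBitWidthQuantizer(nn.Module):
    """Uniform symmetric quantizer whose bit-width is a real-valued parameter."""

    def __init__(self, init_bits=8.0, min_bits=2.0, max_bits=8.0):
        super().__init__()
        self.bits = nn.Parameter(torch.tensor(float(init_bits)))
        self.min_bits, self.max_bits = min_bits, max_bits

    def forward(self, x):
        # Discretize the relaxed bit-width before using it for quantization.
        b = ste_round(self.bits.clamp(self.min_bits, self.max_bits))
        n_levels = 2.0 ** (b - 1) - 1                     # symmetric signed range
        scale = x.detach().abs().max().clamp(min=1e-8) / n_levels
        # Quantize-dequantize with a straight-through estimator on the values.
        return ste_round((x / scale).clamp(-n_levels, n_levels)) * scale


# Usage sketch: quantize a layer's weights and add a bit-width penalty to the
# task loss so that gradient descent trades accuracy against precision.
if __name__ == "__main__":
    layer = nn.Linear(16, 4)
    wq = LearnableBitWidthQuantizer(init_bits=6.0)
    opt = torch.optim.SGD(list(layer.parameters()) + list(wq.parameters()), lr=0.1)

    x = torch.randn(32, 16)
    target = torch.randn(32, 4)
    out = nn.functional.linear(x, wq(layer.weight), layer.bias)
    loss = nn.functional.mse_loss(out, target) + 1e-2 * wq.bits  # penalize wide bit-widths
    loss.backward()
    opt.step()
    print("learned (relaxed) bit-width:", wq.bits.item())
```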
Pages: 442-446
Page count: 5
Related papers
50 records in total
  • [1] Disentangled Loss for Low-Bit Quantization-Aware Training
    Allenet, Thibault
    Briand, David
    Bichler, Olivier
    Sentieys, Olivier
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 2787 - 2791
  • [2] Overcoming Oscillations in Quantization-Aware Training
    Nagel, Markus
    Fournarakis, Marios
    Bondarenko, Yelysei
    Blankevoort, Tijmen
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [3] QUANTIZATION AND TRAINING OF LOW BIT-WIDTH CONVOLUTIONAL NEURAL NETWORKS FOR OBJECT DETECTION
    Yin, Penghang
    Zhang, Shuai
    Qi, Yingyong
    Xin, Jack
    JOURNAL OF COMPUTATIONAL MATHEMATICS, 2019, 37 (03) : 349 - 360
  • [4] Minimizing Energy of DNN Training with Adaptive Bit-width and Voltage Scaling
    Cheng, TaiYu
    Hashimoto, Masanori
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021
  • [5] Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks
    Tuan Hoang
    Thanh-Toan Do
    Nguyen, Tam V.
    Cheung, Ngai-Man
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 2111 - 2118
  • [6] Residual Quantization for Low Bit-Width Neural Networks
    Li, Zefan
    Ni, Bingbing
    Yang, Xiaokang
    Zhang, Wenjun
    Gao, Wen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 214 - 227
  • [7] Overcoming Forgetting Catastrophe in Quantization-Aware Training
    Chen, Ting-An
    Yang, De-Nian
    Chen, Ming-Syan
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17312 - 17321
  • [8] Quantization-Aware Training With Dynamic and Static Pruning
    An, Sangho
    Shin, Jongyun
    Kim, Jangho
    IEEE ACCESS, 2025, 13 : 57476 - 57484
  • [9] Regularizing Activation Distribution for Ultra Low-bit Quantization-Aware Training of MobileNets
    Park, Seongmin
    Sung, Wonyong
    Choi, Jungwook
    2022 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2022, : 138 - 143
  • [10] PowerBit - Power aware arithmetic bit-width optimization
    Gaffar, Altaf Abdul
    Clarke, Jonathan A.
    Constantinides, George A.
    2006 IEEE INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY, PROCEEDINGS, 2006, : 289+