AdaQAT: Adaptive Bit-Width Quantization-Aware Training

Cited by: 0
Authors
Gernigon, Cedric [1 ]
Filip, Silviu-Ioan [1 ]
Sentieys, Olivier [1 ]
Coggiola, Clement [2 ]
Bruno, Mickael [2 ]
Affiliations
[1] Univ Rennes, INRIA, CNRS, IRISA, F-35000 Rennes, France
[2] CNES, Spacecraft Tech, Onboard Data Handling, Toulouse, France
Keywords
Neural Network Compression; Quantization-Aware Training; Adaptive Bit-Width Optimization
DOI
10.1109/AICAS59952.2024.10595895
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Large-scale deep neural networks (DNNs) have achieved remarkable success in many application scenarios. However, the high computational complexity and energy costs of modern DNNs make their deployment on edge devices challenging. Model quantization is a common approach to dealing with deployment constraints, but searching for optimized bit-widths can be difficult. In this work, we present Adaptive Bit-Width Quantization-Aware Training (AdaQAT), a learning-based method that automatically optimizes weight and activation signal bit-widths during training for more efficient DNN inference. We use relaxed, real-valued bit-widths that are updated using a gradient descent rule but are otherwise discretized for all quantization operations. The result is a simple and flexible QAT approach for mixed-precision uniform quantization problems. Unlike other methods, which are generally designed to be run on a pretrained network, AdaQAT works well in both training-from-scratch and fine-tuning scenarios. Initial results on the CIFAR-10 and ImageNet datasets, using ResNet20 and ResNet18 models respectively, indicate that our method is competitive with other state-of-the-art mixed-precision quantization approaches.
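The mechanism described in the abstract, a relaxed real-valued bit-width trained by gradient descent but discretized whenever quantization is actually applied, can be illustrated with a minimal PyTorch sketch. This is an illustration only, not the authors' implementation: the class and parameter names (AdaQuantizer, bit_init) are invented here, the rounding uses a standard straight-through estimator, and a dummy loss stands in for the paper's combination of task loss and inference-cost objective.

import torch
import torch.nn as nn

class RoundSTE(torch.autograd.Function):
    # Rounding with a straight-through gradient estimate.
    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # pass the gradient through the rounding unchanged

class AdaQuantizer(nn.Module):
    # Uniform fake-quantizer whose bit-width is a trainable real value.
    def __init__(self, bit_init=8.0):
        super().__init__()
        # Relaxed, real-valued bit-width, updated by the optimizer like any weight.
        self.bits = nn.Parameter(torch.tensor(bit_init))

    def forward(self, x):
        # Discretize the bit-width for the actual quantization operation.
        b = RoundSTE.apply(self.bits.clamp(2.0, 8.0))
        n_levels = 2.0 ** b - 1.0
        scale = x.detach().abs().max().clamp(min=1e-8) / n_levels
        # Fake-quantize: round onto the grid, keep gradients via the STE.
        return RoundSTE.apply(x / scale) * scale

quant = AdaQuantizer(bit_init=8.0)
w = torch.randn(64, 64, requires_grad=True)
# Dummy objective: a stand-in task loss plus a penalty that rewards fewer bits.
loss = (quant(w) ** 2).mean() + 0.01 * quant.bits
loss.backward()
print(w.grad.shape, quant.bits.grad)  # both the weights and the bit-width receive gradients

In a full training setup, one such quantizer would wrap each layer's weights and activations, and the bit-width gradient would be driven by the trade-off between task accuracy and inference cost that the abstract describes.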
Pages: 442-446
Number of pages: 5
Related Papers
50 records in total
  • [31] Mixed-precision quantization-aware training for photonic neural networks
    Kirtas, Manos
    Passalis, Nikolaos
    Oikonomou, Athina
    Moralis-Pegios, Miltos
    Giamougiannis, George
    Tsakyridis, Apostolos
    Mourgias-Alexandris, George
    Pleros, Nikolaos
    Tefas, Anastasios
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (29): 21361-21379
  • [33] Channel Pruning in Quantization-aware Training: an Adaptive Projection-gradient Descent-shrinkage-splitting Method
    Li, Zhijian
    Xin, Jack
    2022 5TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE FOR INDUSTRIES, AI4I, 2022: 31-34
  • [34] Adaptive bit-width compression for low-energy frame memory design
    Moshnyaga, VG
    SIPS 2001: IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS: DESIGN AND IMPLEMENTATION, 2001: 185-192
  • [35] Quantization-Aware and Tensor-Compressed Training of Transformers for Natural Language Understanding
    Yang, Zi
    Choudhary, Samridhi
    Kunzmann, Siegfried
    Zhang, Zheng
    INTERSPEECH 2023, 2023: 3292-3296
  • [36] Quantization training with two-level bit width
    Kang, Hansung
    Lee, Yongjoo
    Cho, Dongbin
    Lee, Jaeyoung
    Kang, Mincheal
    Kim, Younghoon
    Seo, Jiwon
    2022 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2022
  • [38] MXQN: Mixed quantization for reducing bit-width of weights and activations in deep convolutional neural networks
    Huang, Chenglong
    Liu, Puguang
    Fang, Liang
    APPLIED INTELLIGENCE, 2021, 51 (07): 4561-4574
  • [39] Bit-width aware high-level synthesis for digital signal processing systems
    Le Gal, Bertrand
    Andriamisaina, Caaliph
    Casseau, Emmanuel
    IEEE INTERNATIONAL SOC CONFERENCE, PROCEEDINGS, 2006: 175+
  • [40] Phase-limited quantization-aware training for diffractive deep neural networks
    Wang, Yu
    Sha, Qi
    Qi, Feng
    APPLIED OPTICS, 2025, 64 (06): 1413-1419