AdaQAT: Adaptive Bit-Width Quantization-Aware Training

Cited by: 0
Authors
Gernigon, Cedric [1 ]
Filip, Silviu-Ioan [1 ]
Sentieys, Olivier [1 ]
Coggiola, Clement [2 ]
Bruno, Mickael [2 ]
Affiliations
[1] Univ Rennes, INRIA, CNRS, IRISA, F-35000 Rennes, France
[2] CNES, Spacecraft Tech, Onboard Data Handling, Toulouse, France
Keywords
Neural Network Compression; Quantization-Aware Training; Adaptive Bit-Width Optimization
DOI
10.1109/AICAS59952.2024.10595895
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Large-scale deep neural networks (DNNs) have achieved remarkable success in many application scenarios. However, the high computational complexity and energy costs of modern DNNs make their deployment on edge devices challenging. Model quantization is a common approach to dealing with deployment constraints, but searching for optimized bit-widths can be difficult. In this work, we present Adaptive Bit-Width Quantization-Aware Training (AdaQAT), a learning-based method that automatically optimizes weight and activation signal bit-widths during training for more efficient DNN inference. We use relaxed, real-valued bit-widths that are updated using a gradient descent rule but are otherwise discretized for all quantization operations. The result is a simple and flexible QAT approach for mixed-precision uniform quantization problems. Unlike other methods, which are generally designed to be run on a pretrained network, AdaQAT works well in both training-from-scratch and fine-tuning scenarios. Initial results on the CIFAR-10 and ImageNet datasets, using ResNet20 and ResNet18 models respectively, indicate that our method is competitive with other state-of-the-art mixed-precision quantization approaches.
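The mechanism described in the abstract, a relaxed real-valued bit-width trained by gradient descent but discretized whenever quantization is actually applied, can be illustrated with a minimal PyTorch sketch. This is an illustration only, not the authors' implementation: the class and parameter names (AdaQuantizer, bit_init) are invented here, the rounding uses a standard straight-through estimator, and a dummy loss stands in for the paper's combination of task loss and inference-cost objective.

import torch
import torch.nn as nn

class RoundSTE(torch.autograd.Function):
    # Rounding with a straight-through gradient estimate.
    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # pass the gradient through the rounding unchanged

class AdaQuantizer(nn.Module):
    # Uniform fake-quantizer whose bit-width is a trainable real value.
    def __init__(self, bit_init=8.0):
        super().__init__()
        # Relaxed, real-valued bit-width, updated by the optimizer like any weight.
        self.bits = nn.Parameter(torch.tensor(bit_init))

    def forward(self, x):
        # Discretize the bit-width for the actual quantization operation.
        b = RoundSTE.apply(self.bits.clamp(2.0, 8.0))
        n_levels = 2.0 ** b - 1.0
        scale = x.detach().abs().max().clamp(min=1e-8) / n_levels
        # Fake-quantize: round onto the grid, keep gradients via the STE.
        return RoundSTE.apply(x / scale) * scale

quant = AdaQuantizer(bit_init=8.0)
w = torch.randn(64, 64, requires_grad=True)
# Dummy objective: a stand-in task loss plus a penalty that rewards fewer bits.
loss = (quant(w) ** 2).mean() + 0.01 * quant.bits
loss.backward()
print(w.grad.shape, quant.bits.grad)  # both the weights and the bit-width receive gradients

In a full training setup, one such quantizer would wrap each layer's weights and activations, and the bit-width gradient would be driven by the trade-off between task accuracy and inference cost that the abstract describes.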
Pages: 442-446
Number of pages: 5
Related Papers
50 records in total
  • [31] Mixed-precision quantization-aware training for photonic neural networks
    Kirtas, Manos
    Passalis, Nikolaos
    Oikonomou, Athina
    Moralis-Pegios, Miltos
    Giamougiannis, George
    Tsakyridis, Apostolos
    Mourgias-Alexandris, George
    Pleros, Nikolaos
    Tefas, Anastasios
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (29): 21361-21379
  • [33] Channel Pruning in Quantization-aware Training: an Adaptive Projection-gradient Descent-shrinkage-splitting Method
    Li, Zhijian
    Xin, Jack
    2022 5TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE FOR INDUSTRIES, AI4I, 2022: 31-34
  • [34] Adaptive bit-width compression for low-energy frame memory design
    Moshnyaga, VG
    SIPS 2001: IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS: DESIGN AND IMPLEMENTATION, 2001: 185-192
  • [35] Quantization-Aware and Tensor-Compressed Training of Transformers for Natural Language Understanding
    Yang, Zi
    Choudhary, Samridhi
    Kunzmann, Siegfried
    Zhang, Zheng
    INTERSPEECH 2023, 2023: 3292-3296
  • [36] Quantization training with two-level bit width
    Kang, Hansung
    Lee, Yongjoo
    Cho, Dongbin
    Lee, Jaeyoung
    Kang, Mincheal
    Kim, Younghoon
    Seo, Jiwon
    2022 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2022
  • [38] MXQN: Mixed quantization for reducing bit-width of weights and activations in deep convolutional neural networks
    Huang, Chenglong
    Liu, Puguang
    Fang, Liang
    APPLIED INTELLIGENCE, 2021, 51 (07): 4561-4574
  • [39] Bit-width aware high-level synthesis for digital signal processing systems
    Le Gal, Bertrand
    Andriamisaina, Caaliph
    Casseau, Emmanuel
    IEEE INTERNATIONAL SOC CONFERENCE, PROCEEDINGS, 2006: 175+
  • [40] Phase-limited quantization-aware training for diffractive deep neural networks
    Wang, Yu
    Sha, Qi
    Qi, Feng
    APPLIED OPTICS, 2025, 64 (06): 1413-1419