AdaQAT: Adaptive Bit-Width Quantization-Aware Training

Cited by: 0
Authors
Gernigon, Cedric [1 ]
Filip, Silviu-Ioan [1 ]
Sentieys, Olivier [1 ]
Coggiola, Clement [2 ]
Bruno, Mickael [2 ]
Affiliations
[1] Univ Rennes, INRIA, CNRS, IRISA, F-35000 Rennes, France
[2] CNES, Spacecraft Tech, Onboard Data Handling, Toulouse, France
Keywords
Neural Network Compression; Quantization Aware Training; Adaptive Bit-Width Optimization;
DOI
10.1109/AICAS59952.2024.10595895
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large-scale deep neural networks (DNNs) have achieved remarkable success in many application scenarios. However, the high computational complexity and energy costs of modern DNNs make their deployment on edge devices challenging. Model quantization is a common approach to dealing with deployment constraints, but searching for optimized bit-widths can be challenging. In this work, we present Adaptive Bit-Width Quantization-Aware Training (AdaQAT), a learning-based method that automatically optimizes weight and activation signal bit-widths during training for more efficient DNN inference. We use relaxed real-valued bit-widths that are updated using a gradient descent rule, but are otherwise discretized for all quantization operations. The result is a simple and flexible QAT approach for mixed-precision uniform quantization problems. Compared to other methods that are generally designed to be run on a pretrained network, AdaQAT works well in both training-from-scratch and fine-tuning scenarios. Initial results on the CIFAR-10 and ImageNet datasets using ResNet20 and ResNet18 models, respectively, indicate that our method is competitive with other state-of-the-art mixed-precision quantization approaches.
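The core idea in the abstract — keeping a relaxed, real-valued bit-width that is discretized whenever quantization is actually applied, while gradient descent updates the real-valued copy — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `quantize` helper, the surrogate gradient `grad_b`, and the accuracy/cost trade-off weight are all hypothetical stand-ins for the loss terms and update rule derived in the paper.

```python
import numpy as np

def quantize(w, bitwidth):
    """Uniform symmetric quantization of w, using the discretized
    bit-width ceil(bitwidth) (hypothetical helper, not the paper's scheme)."""
    b = int(np.ceil(bitwidth))                 # discretize the relaxed bit-width
    scale = np.max(np.abs(w)) / (2 ** (b - 1) - 1)
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(16)                    # toy stand-in for a weight tensor

# Relaxed, real-valued bit-width updated by gradient descent.
# The gradient below is a toy surrogate that trades cost (more bits)
# against accuracy (quantization error); AdaQAT derives its own rule.
b, lr = 8.0, 0.01
for _ in range(100):
    wq = quantize(w, b)                        # forward pass uses discretized b
    quant_err = np.mean((w - wq) ** 2)
    grad_b = 1.0 - 100.0 * quant_err           # hypothetical trade-off gradient
    b -= lr * grad_b                           # update the real-valued copy
```

The design point this sketch mirrors is that the forward computation only ever sees an integer bit-width, so standard uniform quantization kernels apply unchanged, while the optimizer works on a continuous relaxation.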
Pages: 442 - 446 (5 pages)
Related Papers
50 items in total
  • [41] QUANTIZATION-AWARE PARAMETER ESTIMATION FOR AUDIO UPMIXING
    Rohlfing, Christian
    Liutkus, Antoine
    Becker, Julian M.
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 746 - 750
  • [42] Quantization-Aware Pruning Criterion for Industrial Applications
    Gil, Yoonhee
    Park, Jong-Hyeok
    Baek, Jongchan
    Han, Soohee
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2022, 69 (03) : 3203 - 3213
  • [43] MBQuant: A novel multi-branch topology method for arbitrary bit-width network quantization
    Zhong, Yunshan
    Zhou, Yuyao
    Chao, Fei
    Ji, Rongrong
    PATTERN RECOGNITION, 2025, 158
  • [44] Compiling for Reduced Bit-Width Queue Processors
    Canedo, Arquimedes
    Abderazek, Ben A.
    Sowa, Masahiro
    JOURNAL OF SIGNAL PROCESSING SYSTEMS, 2010, 59 : 45 - 55
  • [45] Effective Bit-Width and Under-Approximation
    Brummayer, Robert
    Biere, Armin
    COMPUTER AIDED SYSTEMS THEORY - EUROCAST 2009, 2009, 5717 : 304 - 311
  • [46] Quantization-Aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks
    Lechner, Mathias
    Zikelic, Dorde
    Chatterjee, Krishnendu
    Henzinger, Thomas A.
    Rus, Daniela
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 12, 2023, : 14964 - 14973
  • [47] Knowledge-guided quantization-aware training for EEG-based emotion recognition
    Zhong, Sheng-hua
    Shi, Jiahao
    Wang, Yi
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2025, 108
  • [48] Enhanced Precision Analysis for Accuracy-Aware Bit-Width Optimization Using Affine Arithmetic
    Vakili, Shervin
    Langlois, J. M. Pierre
    Bois, Guy
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2013, 32 (12) : 1853 - 1865
  • [49] Accuracy-guaranteed bit-width optimization
    Lee, Dong-U.
    Gaffar, Altaf Abdul
    Cheung, Ray C. C.
    Mencer, Oskar
    Luk, Wayne
    Constantinides, George A.
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2006, 25 (10) : 1990 - 2000
  • [50] Compiling for Reduced Bit-Width Queue Processors
    Canedo, Arquimedes
    Abderazek, Ben A.
    Sowa, Masahiro
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2010, 59 (01): : 45 - 55