AdaQAT: Adaptive Bit-Width Quantization-Aware Training

Cited by: 0
Authors
Gernigon, Cedric [1 ]
Filip, Silviu-Ioan [1 ]
Sentieys, Olivier [1 ]
Coggiola, Clement [2 ]
Bruno, Mickael [2 ]
Affiliations
[1] Univ Rennes, INRIA, CNRS, IRISA, F-35000 Rennes, France
[2] CNES, Spacecraft Tech, Onboard Data Handling, Toulouse, France
Keywords
Neural Network Compression; Quantization Aware Training; Adaptive Bit-Width Optimization;
DOI
10.1109/AICAS59952.2024.10595895
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Large-scale deep neural networks (DNNs) have achieved remarkable success in many application scenarios. However, the high computational complexity and energy cost of modern DNNs make their deployment on edge devices challenging. Model quantization is a common approach to dealing with deployment constraints, but searching for optimized bit-widths can be difficult. In this work, we present Adaptive Bit-Width Quantization-Aware Training (AdaQAT), a learning-based method that automatically optimizes weight and activation bit-widths during training for more efficient DNN inference. We use relaxed real-valued bit-widths that are updated using a gradient descent rule, but are otherwise discretized for all quantization operations. The result is a simple and flexible QAT approach for mixed-precision uniform quantization problems. Compared to other methods, which are generally designed to be run on a pretrained network, AdaQAT works well in both training-from-scratch and fine-tuning scenarios. Initial results on the CIFAR-10 and ImageNet datasets using ResNet20 and ResNet18 models, respectively, indicate that our method is competitive with other state-of-the-art mixed-precision quantization approaches.
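The abstract's core idea, relaxed real-valued bit-widths that gradient descent updates but that are discretized whenever quantization is performed, can be illustrated with a short sketch. The following is a hypothetical PyTorch illustration, not the authors' released implementation: it assumes symmetric uniform quantization, a straight-through estimator for both value rounding and bit-width rounding, and a simple penalty term on the bit-width; the names `ste_round` and `LearnableBitWidthQuantizer` are invented for this example.

```python
# Hypothetical sketch of learnable bit-width quantization-aware training.
# Not the AdaQAT reference code: a relaxed, real-valued bit-width is kept as a
# trainable parameter, rounded to an integer whenever it is used to quantize,
# and the rounding is bypassed in the backward pass (straight-through
# estimator) so that ordinary gradient descent can update it.
import torch
import torch.nn as nn


def ste_round(x):
    """Round in the forward pass; identity gradient in the backward pass."""
    return (x.round() - x).detach() + x


class LearnableBitWidthQuantizer(nn.Module):
    """Uniform symmetric quantizer whose bit-width is a real-valued parameter."""

    def __init__(self, init_bits=8.0, min_bits=2.0, max_bits=8.0):
        super().__init__()
        self.bits = nn.Parameter(torch.tensor(float(init_bits)))
        self.min_bits, self.max_bits = min_bits, max_bits

    def forward(self, x):
        # Discretize the relaxed bit-width before using it for quantization.
        b = ste_round(self.bits.clamp(self.min_bits, self.max_bits))
        n_levels = 2.0 ** (b - 1) - 1                     # symmetric signed range
        scale = x.detach().abs().max().clamp(min=1e-8) / n_levels
        # Quantize-dequantize with a straight-through estimator on the values.
        return ste_round((x / scale).clamp(-n_levels, n_levels)) * scale


# Usage sketch: quantize a layer's weights and add a bit-width penalty to the
# task loss so that gradient descent trades accuracy against precision.
if __name__ == "__main__":
    layer = nn.Linear(16, 4)
    wq = LearnableBitWidthQuantizer(init_bits=6.0)
    opt = torch.optim.SGD(list(layer.parameters()) + list(wq.parameters()), lr=0.1)

    x = torch.randn(32, 16)
    target = torch.randn(32, 4)
    out = nn.functional.linear(x, wq(layer.weight), layer.bias)
    loss = nn.functional.mse_loss(out, target) + 1e-2 * wq.bits  # penalize wide bit-widths
    loss.backward()
    opt.step()
    print("learned (relaxed) bit-width:", wq.bits.item())
```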
Pages: 442-446
Page count: 5
Related papers
50 records in total
  • [1] Disentangled Loss for Low-Bit Quantization-Aware Training
    Allenet, Thibault
    Briand, David
    Bichler, Olivier
    Sentieys, Olivier
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 2787 - 2791
  • [2] Overcoming Oscillations in Quantization-Aware Training
    Nagel, Markus
    Fournarakis, Marios
    Bondarenko, Yelysei
    Blankevoort, Tijmen
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [3] QUANTIZATION AND TRAINING OF LOW BIT-WIDTH CONVOLUTIONAL NEURAL NETWORKS FOR OBJECT DETECTION
    Yin, Penghang
    Zhang, Shuai
    Qi, Yingyong
    Xin, Jack
    JOURNAL OF COMPUTATIONAL MATHEMATICS, 2019, 37 (03) : 349 - 360
  • [4] Minimizing Energy of DNN Training with Adaptive Bit-width and Voltage Scaling
    Cheng, TaiYu
    Hashimoto, Masanori
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021
  • [5] Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks
    Tuan Hoang
    Thanh-Toan Do
    Nguyen, Tam V.
    Cheung, Ngai-Man
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 2111 - 2118
  • [6] Residual Quantization for Low Bit-Width Neural Networks
    Li, Zefan
    Ni, Bingbing
    Yang, Xiaokang
    Zhang, Wenjun
    Gao, Wen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 214 - 227
  • [7] Overcoming Forgetting Catastrophe in Quantization-Aware Training
    Chen, Ting-An
    Yang, De-Nian
    Chen, Ming-Syan
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17312 - 17321
  • [8] Quantization-Aware Training With Dynamic and Static Pruning
    An, Sangho
    Shin, Jongyun
    Kim, Jangho
    IEEE ACCESS, 2025, 13 : 57476 - 57484
  • [9] Regularizing Activation Distribution for Ultra Low-bit Quantization-Aware Training of MobileNets
    Park, Seongmin
    Sung, Wonyong
    Choi, Jungwook
    2022 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2022, : 138 - 143
  • [10] PowerBit - Power aware arithmetic bit-width optimization
    Gaffar, Altaf Abdul
    Clarke, Jonathan A.
    Constantinides, George A.
    2006 IEEE INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY, PROCEEDINGS, 2006, : 289+