4-bit Conformer with Native Quantization Aware Training for Speech Recognition

Cited by: 3
Authors
Ding, Shaojin [1]
Meadowlark, Phoenix [1]
He, Yanzhang [1]
Lew, Lukasz [1]
Agrawal, Shivani [1]
Rybakov, Oleg [1]
Affiliations
[1] Google LLC, Mountain View, CA 94043 USA
Source
Keywords
speech recognition; model quantization; 4-bit quantization
DOI
10.21437/Interspeech.2022-10809
CLC Classification
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
Reducing latency and model size has long been a significant research problem for live Automatic Speech Recognition (ASR) application scenarios. Along this direction, model quantization has become an increasingly popular approach to compress neural networks and reduce computation cost. Most existing practical ASR systems apply post-training 8-bit quantization. To achieve a higher compression rate without introducing additional performance regression, in this study, we propose to develop 4-bit ASR models with native quantization aware training, which leverages native integer operations to effectively optimize both training and inference. We conducted two experiments on state-of-the-art Conformer-based ASR models to evaluate our proposed quantization technique. First, we explored the impact of different precisions for both weight and activation quantization on the LibriSpeech dataset, and obtained a lossless 4-bit Conformer model with 7.7x size reduction compared to the float32 model. Following this, we investigated, for the first time, the viability of 4-bit quantization on a practical ASR system trained with large-scale datasets, and produced a lossless Conformer ASR model with mixed 4-bit and 8-bit weights that has 5x size reduction compared to the float32 model.
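The abstract centers on quantization aware training (QAT), in which weights are "fake-quantized" during the forward pass so the network learns to tolerate low-precision arithmetic. Below is a minimal, illustrative sketch of symmetric per-tensor 4-bit fake quantization; this is generic QAT machinery for intuition only, not the paper's native-integer implementation, and the function name and scaling choices are assumptions.

```python
def fake_quantize(values, num_bits=4):
    """Map floats onto a signed integer grid of the given bit width,
    then dequantize back to float.

    In QAT the forward pass uses these quantized values, while gradients
    flow through rounding as if it were the identity (the straight-through
    estimator).
    """
    qmax = 2 ** (num_bits - 1) - 1   # e.g. 7 for 4-bit signed
    qmin = -qmax - 1                 # e.g. -8 for 4-bit signed
    scale = max(abs(v) for v in values) / qmax
    if scale == 0:
        return list(values)          # all-zero tensor: nothing to quantize
    # Round to the integer grid, clip to the representable range.
    q = [min(max(round(v / scale), qmin), qmax) for v in values]
    # Dequantize: the result is float, but only 2**num_bits distinct levels.
    return [qi * scale for qi in q]

weights = [0.9, -0.35, 0.02, -0.7]
print(fake_quantize(weights, num_bits=4))
```

With 4 bits, only 16 distinct levels are available per tensor, which is why training-time simulation (rather than post-training rounding) is typically needed to retain accuracy at this precision.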
Pages: 1711-1715 (5 pages)
Related Papers
36 records in total
  • [1] 4-bit Quantization of LSTM-based Speech Recognition Models
    Fasoli, Andrea
    Chen, Chia-Yu
    Serrano, Mauricio
    Sun, Xiao
    Wang, Naigang
    Venkataramani, Swagath
    Saon, George
    Cui, Xiaodong
    Kingsbury, Brian
    Zhang, Wei
    Tuske, Zoltan
    Gopalakrishnan, Kailash
    INTERSPEECH 2021, 2021, : 2586 - 2590
  • [2] 2-bit Conformer quantization for automatic speech recognition
    Rybakov, Oleg
    Meadowlark, Phoenix
    Ding, Shaojin
    Qiu, David
    Li, Jian
    Rim, David
    He, Yanzhang
    INTERSPEECH 2023, 2023, : 4908 - 4912
  • [3] Post training 4-bit quantization of convolutional networks for rapid-deployment
    Banner, Ron
    Nahshan, Yury
    Soudry, Daniel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [4] Lossless 4-bit Quantization of Architecture Compressed Conformer ASR Systems on the 300-hr Switchboard Corpus
    Li, Zhaoqing
    Wang, Tianzi
    Deng, Jiajun
    Xu, Junhao
    Hu, Shoukang
    Liu, Xunying
    INTERSPEECH 2023, 2023, : 3332 - 3336
  • [5] Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition
    Zhen, Kai
    Nguyen, Hieu Duy
    Chinta, Raviteja
    Susanj, Nathan
    Mouchtaris, Athanasios
    Afzal, Tariq
    Rastrow, Ariya
    INTERSPEECH 2022, 2022, : 3033 - 3037
  • [6] Training Transformers with 4-bit Integers
    Xi, Haocheng
    Li, Changhao
    Chen, Jianfei
    Zhu, Jun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [7] Quantization Aware Training with Absolute-Cosine Regularization for Automatic Speech Recognition
    Nguyen, Hieu Duy
    Alexandridis, Anastasios
    Mouchtaris, Athanasios
    INTERSPEECH 2020, 2020, : 3366 - 3370
  • [8] AdaQAT: Adaptive Bit-Width Quantization-Aware Training
    Gernigon, Cedric
    Filip, Silviu-Ioan
    Sentieys, Olivier
    Coggiola, Clement
    Bruno, Mickael
    2024 IEEE 6TH INTERNATIONAL CONFERENCE ON AI CIRCUITS AND SYSTEMS, AICAS 2024, 2024, : 442 - 446
  • [9] Disentangled Loss for Low-Bit Quantization-Aware Training
    Allenet, Thibault
    Briand, David
    Bichler, Olivier
    Sentieys, Olivier
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 2787 - 2791
  • [10] Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems
    Deng, Jiajun
    Li, Guinan
    Xie, Xurong
    Jin, Zengrui
    Cui, Mingyu
    Wang, Tianzi
    Hu, Shujie
    Geng, Mengzhe
    Liu, Xunying
    INTERSPEECH 2023, 2023, : 3342 - 3346