4-bit Conformer with Native Quantization Aware Training for Speech Recognition

Cited by: 3
Authors
Ding, Shaojin [1]
Meadowlark, Phoenix [1]
He, Yanzhang [1]
Lew, Lukasz [1]
Agrawal, Shivani [1]
Rybakov, Oleg [1]
Affiliations
[1] Google LLC, Mountain View, CA 94043 USA
Source
Keywords
speech recognition; model quantization; 4-bit quantization
DOI
10.21437/Interspeech.2022-10809
CLC Classification
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
Reducing latency and model size has long been a significant research problem for live Automatic Speech Recognition (ASR) application scenarios. Along this direction, model quantization has become an increasingly popular approach to compress neural networks and reduce computation cost. Most existing practical ASR systems apply post-training 8-bit quantization. To achieve a higher compression rate without introducing additional performance regression, in this study, we propose to develop 4-bit ASR models with native quantization aware training, which leverages native integer operations to effectively optimize both training and inference. We conducted two experiments on state-of-the-art Conformer-based ASR models to evaluate our proposed quantization technique. First, we explored the impact of different precisions for both weight and activation quantization on the LibriSpeech dataset, and obtained a lossless 4-bit Conformer model with 7.7x size reduction compared to the float32 model. Following this, we investigated, for the first time, the viability of 4-bit quantization on a practical ASR system trained with large-scale datasets, and produced a lossless Conformer ASR model with mixed 4-bit and 8-bit weights that has 5x size reduction compared to the float32 model.
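The abstract centers on quantization aware training (QAT), in which weights are "fake-quantized" during the forward pass so the network learns to tolerate low-precision arithmetic. Below is a minimal, illustrative sketch of symmetric per-tensor 4-bit fake quantization; this is generic QAT machinery for intuition only, not the paper's native-integer implementation, and the function name and scaling choices are assumptions.

```python
def fake_quantize(values, num_bits=4):
    """Map floats onto a signed integer grid of the given bit width,
    then dequantize back to float.

    In QAT the forward pass uses these quantized values, while gradients
    flow through rounding as if it were the identity (the straight-through
    estimator).
    """
    qmax = 2 ** (num_bits - 1) - 1   # e.g. 7 for 4-bit signed
    qmin = -qmax - 1                 # e.g. -8 for 4-bit signed
    scale = max(abs(v) for v in values) / qmax
    if scale == 0:
        return list(values)          # all-zero tensor: nothing to quantize
    # Round to the integer grid, clip to the representable range.
    q = [min(max(round(v / scale), qmin), qmax) for v in values]
    # Dequantize: the result is float, but only 2**num_bits distinct levels.
    return [qi * scale for qi in q]

weights = [0.9, -0.35, 0.02, -0.7]
print(fake_quantize(weights, num_bits=4))
```

With 4 bits, only 16 distinct levels are available per tensor, which is why training-time simulation (rather than post-training rounding) is typically needed to retain accuracy at this precision.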
Pages: 1711-1715 (5 pages)
Related Papers
36 records in total
  • [1] 4-bit Quantization of LSTM-based Speech Recognition Models
    Fasoli, Andrea
    Chen, Chia-Yu
    Serrano, Mauricio
    Sun, Xiao
    Wang, Naigang
    Venkataramani, Swagath
    Saon, George
    Cui, Xiaodong
    Kingsbury, Brian
    Zhang, Wei
    Tuske, Zoltan
    Gopalakrishnan, Kailash
    INTERSPEECH 2021, 2021, : 2586 - 2590
  • [2] 2-bit Conformer quantization for automatic speech recognition
    Rybakov, Oleg
    Meadowlark, Phoenix
    Ding, Shaojin
    Qiu, David
    Li, Jian
    Rim, David
    He, Yanzhang
    INTERSPEECH 2023, 2023, : 4908 - 4912
  • [3] Post training 4-bit quantization of convolutional networks for rapid-deployment
    Banner, Ron
    Nahshan, Yury
    Soudry, Daniel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [4] Lossless 4-bit Quantization of Architecture Compressed Conformer ASR Systems on the 300-hr Switchboard Corpus
    Li, Zhaoqing
    Wang, Tianzi
    Deng, Jiajun
    Xu, Junhao
    Hu, Shoukang
    Liu, Xunying
    INTERSPEECH 2023, 2023, : 3332 - 3336
  • [5] Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition
    Zhen, Kai
    Nguyen, Hieu Duy
    Chinta, Raviteja
    Susanj, Nathan
    Mouchtaris, Athanasios
    Afzal, Tariq
    Rastrow, Ariya
    INTERSPEECH 2022, 2022, : 3033 - 3037
  • [6] Training Transformers with 4-bit Integers
    Xi, Haocheng
    Li, Changhao
    Chen, Jianfei
    Zhu, Jun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [7] Quantization Aware Training with Absolute-Cosine Regularization for Automatic Speech Recognition
    Nguyen, Hieu Duy
    Alexandridis, Anastasios
    Mouchtaris, Athanasios
    INTERSPEECH 2020, 2020, : 3366 - 3370
  • [8] AdaQAT: Adaptive Bit-Width Quantization-Aware Training
    Gernigon, Cedric
    Filip, Silviu-Ioan
    Sentieys, Olivier
    Coggiola, Clement
    Bruno, Mickael
    2024 IEEE 6TH INTERNATIONAL CONFERENCE ON AI CIRCUITS AND SYSTEMS, AICAS 2024, 2024, : 442 - 446
  • [9] Disentangled Loss for Low-Bit Quantization-Aware Training
    Allenet, Thibault
    Briand, David
    Bichler, Olivier
    Sentieys, Olivier
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 2787 - 2791
  • [10] Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems
    Deng, Jiajun
    Li, Guinan
    Xie, Xurong
    Jin, Zengrui
    Cui, Mingyu
    Wang, Tianzi
    Hu, Shujie
    Geng, Mengzhe
    Liu, Xunying
    INTERSPEECH 2023, 2023, : 3342 - 3346