2-bit Conformer quantization for automatic speech recognition

Authors
Rybakov, Oleg [1]
Meadowlark, Phoenix [1]
Ding, Shaojin [1]
Qiu, David [1]
Li, Jian [1]
Rim, David [1]
He, Yanzhang [1]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
Source
Keywords
speech recognition; model quantization; low-bit quantization;
DOI
10.21437/Interspeech.2023-1012
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Large speech models are rapidly gaining traction in the research community. As a result, model compression has become an important topic, so that these models can fit in memory and be served at reduced cost. Practical approaches for compressing automatic speech recognition (ASR) models use int8 or int4 weight quantization. In this study, we propose to develop 2-bit ASR models. We explore the impact of symmetric and asymmetric quantization combined with sub-channel quantization and clipping on both the LibriSpeech dataset and large-scale training data. We obtain a lossless 2-bit Conformer model with a 32% model size reduction compared to the state-of-the-art 4-bit Conformer model for LibriSpeech. With the large-scale training data, we obtain a 2-bit Conformer model with over 40% model size reduction against the 4-bit version at the cost of a 17% relative word error rate degradation.
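The terms in the abstract (symmetric vs. asymmetric quantization, sub-channel quantization, clipping) can be illustrated with a small sketch. This is not the authors' implementation: the function names `quantize_group` and `quantize_row`, the `clip` factor, and the group layout are assumptions made for illustration, and the code performs "fake" quantization (quantize, then dequantize) on plain Python lists.

```python
def quantize_group(values, bits=2, symmetric=False, clip=1.0):
    """Fake-quantize one group of weights to `bits` bits (illustrative only)."""
    levels = 2 ** bits  # 4 representable values when bits == 2
    if symmetric:
        # Symmetric: zero-centred grid, integer codes in [-2, 1] for 2 bits.
        max_abs = clip * max(abs(v) for v in values)
        qmin, qmax = -(levels // 2), levels // 2 - 1
        scale = max_abs / (levels // 2) if max_abs > 0 else 1.0
        zero = 0.0
    else:
        # Asymmetric: grid spans [lo, hi]; `clip` < 1 shrinks the range
        # around its midpoint (assumed clipping scheme).
        lo, hi = min(values), max(values)
        mid, half = (lo + hi) / 2, clip * (hi - lo) / 2
        lo, hi = mid - half, mid + half
        qmin, qmax = 0, levels - 1
        scale = (hi - lo) / (levels - 1) if hi > lo else 1.0
        zero = lo
    out = []
    for v in values:
        q = round((v - zero) / scale)      # nearest integer code
        q = max(qmin, min(qmax, q))        # saturate to the 2-bit range
        out.append(q * scale + zero)       # dequantize back to float
    return out


def quantize_row(row, group_size, **kwargs):
    """Sub-channel quantization: each block of `group_size` weights in a
    channel gets its own scale and offset."""
    out = []
    for i in range(0, len(row), group_size):
        out.extend(quantize_group(row[i:i + group_size], **kwargs))
    return out


row = [0.0, 0.1, 0.2, 0.3, 10.0, 20.0, 30.0, 40.0]
per_channel = quantize_row(row, group_size=8)  # one scale for the whole row
sub_channel = quantize_row(row, group_size=4)  # one scale per 4 weights
```

In this toy row, a single per-channel scale is dominated by the large values and collapses the small ones to zero, while per-group scales represent both halves closely. The trade-off is that each extra group stores its own scale (and offset in the asymmetric case), which adds to model size.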
Pages: 4908-4912
Number of pages: 5