2-bit Conformer quantization for automatic speech recognition

Cited by: 0
Authors
Rybakov, Oleg [1]
Meadowlark, Phoenix [1]
Ding, Shaojin [1]
Qiu, David [1]
Li, Jian [1]
Rim, David [1]
He, Yanzhang [1]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
Source
Keywords
speech recognition; model quantization; low-bit quantization
DOI
10.21437/Interspeech.2023-1012
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
Large speech models are rapidly gaining traction in the research community. As a result, model compression has become an important topic, so that these models can fit in memory and be served at reduced cost. Practical approaches for compressing automatic speech recognition (ASR) models use int8 or int4 weight quantization. In this study, we propose to develop 2-bit ASR models. We explore the impact of symmetric and asymmetric quantization combined with sub-channel quantization and clipping on both the LibriSpeech dataset and large-scale training data. We obtain a lossless 2-bit Conformer model with a 32% model size reduction compared to the state-of-the-art 4-bit Conformer model for LibriSpeech. With the large-scale training data, we obtain a 2-bit Conformer model with over 40% model size reduction against the 4-bit version at the cost of a 17% relative word error rate degradation.
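The terms in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the group size, and the clipping convention (shrinking the range around its midpoint, or the bound for the symmetric case) are all assumptions chosen for illustration. It shows symmetric and asymmetric 2-bit quantization applied to sub-channel groups, where each group of weights within a channel gets its own scale.

```python
# Illustrative sketch only; names and conventions are assumptions, not the paper's code.

def quantize_asymmetric(group, bits=2, clip=1.0):
    """Asymmetric quantization: codes in [0, 2**bits - 1] plus a scale and an offset."""
    levels = 2 ** bits - 1                       # 3 steps, i.e. 4 codes for 2 bits
    center = (max(group) + min(group)) / 2
    half = (max(group) - min(group)) / 2 * clip  # clip < 1 narrows the range (assumed convention)
    lo = center - half
    scale = (2 * half) / levels or 1.0           # fall back to 1.0 for a constant group
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in group]
    return codes, scale, lo


def quantize_symmetric(group, bits=2, clip=1.0):
    """Symmetric quantization: signed codes in [-2**(bits-1), 2**(bits-1) - 1], no offset."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    bound = max(abs(w) for w in group) * clip
    scale = bound / -qmin or 1.0                 # one possible convention: map bound to |qmin|
    codes = [min(qmax, max(qmin, round(w / scale))) for w in group]
    return codes, scale


def dequantize_asymmetric(codes, scale, lo):
    """Map asymmetric codes back to approximate weights."""
    return [c * scale + lo for c in codes]


def sub_channel_quantize(channel, group_size=4, bits=2, clip=1.0):
    """Split one channel into groups and quantize each group with its own scale."""
    groups = [channel[i:i + group_size] for i in range(0, len(channel), group_size)]
    return [quantize_asymmetric(g, bits, clip) for g in groups]


# Hypothetical weights for one output channel.
channel = [-1.0, -0.5, 0.5, 1.0, 0.0, 0.1, 0.2, 0.3]
quantized = sub_channel_quantize(channel, group_size=4)
restored = [w for codes, scale, lo in quantized
            for w in dequantize_asymmetric(codes, scale, lo)]
```

In this sketch the second group spans a much narrower range than the first, so a per-group scale gives it a finer step than a single per-channel scale would. The extra scales and offsets stored per group add overhead, which is one reason a 2-bit model need not be exactly half the size of a 4-bit one.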
Pages: 4908-4912
Page count: 5
Related papers
50 in total
  • [1] 4-bit Conformer with Native Quantization Aware Training for Speech Recognition
    Ding, Shaojin
    Meadowlark, Phoenix
    He, Yanzhang
    Lew, Lukasz
    Agrawal, Shivani
    Rybakov, Oleg
    INTERSPEECH 2022, 2022, : 1711 - 1715
  • [2] CORRELATORS WITH 2-BIT QUANTIZATION
    COOPER, BFC
    AUSTRALIAN JOURNAL OF PHYSICS, 1970, 23 (04): : 521 - &
  • [3] PITCH-ADAPTIVE DPCM CODING OF SPEECH WITH 2-BIT QUANTIZATION AND FIXED SPECTRUM PREDICTION
    JAYANT, NS
    BELL SYSTEM TECHNICAL JOURNAL, 1977, 56 (03): : 439 - 454
  • [4] QuIP: 2-Bit Quantization of Large Language Models With Guarantees
    Chee, Jerry
    Cai, Yaohui
    Kuleshov, Volodymyr
    De Sa, Christopher
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] EFFICIENT CONFORMER: PROGRESSIVE DOWNSAMPLING AND GROUPED ATTENTION FOR AUTOMATIC SPEECH RECOGNITION
    Burchi, Maxime
    Vielzeuf, Valentin
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 8 - 15
  • [6] Research of 2-bit Quantization Arithmetic in DS-SS Receiver
    Zhao Hongwei
    Lian Baowang
    Feng Juan
    Lian Jie
    ICIEA: 2009 4TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, VOLS 1-6, 2009, : 2686 - 2689
  • [7] Quantization Error Reduction for the Phased Array with 2-bit Phase Shifter
    Song, Jian
    Wang, Jun
    Peng, Kewu
    Pan, Changyong
    Yang, Zhixing
    WIRELESS PERSONAL COMMUNICATIONS, 2010, 52 (01) : 29 - 41
  • [8] Quantization Error Reduction for the Phased Array with 2-bit Phase Shifter
    Jian Song
    Jun Wang
    Kewu Peng
    Changyong Pan
    Zhixing Yang
    Wireless Personal Communications, 2010, 52 : 29 - 41
  • [9] Sampleformer: An efficient conformer-based Neural Network for Automatic Speech Recognition
    Fan, Zeping
    Zhang, Xuejun
    Huang, Min
    Bu, Zhaohui
    INTELLIGENT DATA ANALYSIS, 2024, 28 (06) : 1647 - 1659
  • [10] 2-BIT CORRELATION
    HORNER, JL
    BARTELT, HO
    APPLIED OPTICS, 1985, 24 (18): : 2889 - 2893