On the Optimal Interpolation Weights for Hybrid Autoregressive Transducer Model

Cited by: 0
Authors
Variani, Ehsan [1]
Riley, Michael [1]
Rybach, David [1]
Allauzen, Cyril [1]
Chen, Tongzhou [1]
Ramabhadran, Bhuvana [1]
Affiliation
[1] Google Inc, Mountain View, CA 94043 USA
Source
Keywords
speech recognition; two-pass recognition; rescoring weights
DOI
10.21437/Interspeech.2022-4
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper explores rescoring strategies to improve a two-pass speech recognition system in which the first pass is a hybrid autoregressive transducer model and the second pass is a neural language model. The main focus is on the scores provided by each of these models: their quantitative analysis, how to improve them, and the best way to combine them to achieve better recognition accuracy. Several analyses are presented to emphasize the importance of the choice of integration weights for combining the first-pass and second-pass scores. A sequence-level combination weight estimation model, along with four training criteria, is proposed, which allows adaptive integration of the scores per acoustic sequence. The effectiveness of this algorithm is demonstrated by constructing and analyzing models on the LibriSpeech data set. The proposed adaptive weight interpolation technique is shown to achieve a 5% relative gain over the baseline model with non-adaptive weights.
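As a rough illustration of what combining first-pass and second-pass scores with an integration weight can look like, the sketch below rescores an N-best list with a simple log-linear interpolation. The abstract does not give the paper's exact formula, so the form `first_pass_logp + lm_weight * lm_logp`, the names `Hypothesis` and `rescore`, and all numeric scores are assumptions made up for this example; it is not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    first_pass_logp: float  # log-score from the first-pass model (toy value)
    lm_logp: float          # log-score from the second-pass language model (toy value)


def rescore(nbest, lm_weight):
    """Pick the hypothesis maximizing first_pass_logp + lm_weight * lm_logp.

    Assumed log-linear interpolation; the paper's actual combination
    rule may differ (it is not stated in the abstract).
    """
    return max(nbest, key=lambda h: h.first_pass_logp + lm_weight * h.lm_logp)


# Toy N-best list with invented scores.
nbest = [
    Hypothesis("recognize speech", first_pass_logp=-4.0, lm_logp=-6.0),
    Hypothesis("wreck a nice beach", first_pass_logp=-3.5, lm_logp=-9.0),
]

# A non-adaptive system would use one lm_weight for every utterance;
# an adaptive system would supply a different lm_weight per utterance.
first_pass_only = rescore(nbest, lm_weight=0.0)
with_lm = rescore(nbest, lm_weight=0.5)
print(first_pass_only.text)
print(with_lm.text)
```

In this toy case the selected hypothesis changes with the weight, which is the sensitivity the abstract refers to. An adaptive scheme would replace the constant `lm_weight` with a value predicted separately for each utterance.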
Pages: 1646-1650
Page count: 5