HYBRID AUTOREGRESSIVE TRANSDUCER (HAT)

Cited by: 0
Authors
Variani, Ehsan
Rybach, David
Allauzen, Cyril
Riley, Michael
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Keywords
ASR; Encoder-decoder; Beam Search
DOI
10.1109/icassp40776.2020.9053600
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper proposes and evaluates the hybrid autoregressive transducer (HAT) model, a time-synchronous encoder-decoder model that preserves the modularity of conventional automatic speech recognition systems. The HAT model provides a way to measure the quality of the internal language model, which can be used to decide whether inference with an external language model is beneficial. We evaluate the proposed model on a large-scale voice search task. Our experiments show significant improvements in WER compared to state-of-the-art approaches.
Pages: 6139-6143
Number of pages: 5
Related papers (first 10 of 50 shown)
  • [1] MODULAR HYBRID AUTOREGRESSIVE TRANSDUCER
    Meng, Zhong
    Chen, Tongzhou
    Prabhavalkar, Rohit
    Zhang, Yu
    Wang, Gary
    Audhkhasi, Kartik
    Emond, Jesse
    Strohman, Trevor
    Ramabhadran, Bhuvana
    Huang, W. Ronny
    Variani, Ehsan
    Huang, Yinghui
    Moreno, Pedro J.
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 197 - 204
  • [2] On the Optimal Interpolation Weights for Hybrid Autoregressive Transducer Model
    Variani, Ehsan
    Riley, Michael
    Rybach, David
    Allauzen, Cyril
    Chen, Tongzhou
    Ramabhadran, Bhuvana
    INTERSPEECH 2022, 2022, : 1646 - 1650
  • [3] On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer
    Lu, Liang
    Meng, Zhong
    Kanda, Naoyuki
    Li, Jinyu
    Gong, Yifan
    INTERSPEECH 2021, 2021, : 3435 - 3439
  • [4] A More Accurate Internal Language Model Score Estimation for the Hybrid Autoregressive Transducer
    Lee, Kyungmin
    Kim, Haeri
    Jin, Sichen
    Park, Jinhwan
    Han, Youngho
    INTERSPEECH 2023, 2023, : 869 - 873
  • [5] TransGesture: Autoregressive Gesture Generation with RNN-Transducer
    Kaneko, Naoshi
    Mitsubayashi, Yuna
    Mu, Geng
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2022, 2022, : 753 - 757
  • [6] HYBRID ACOUSTOOPTICAL TRANSDUCER
    ANIKEEV, DI
    BOCHAROV, YV
    KAPUSTINA, OA
    SOVIET PHYSICS ACOUSTICS-USSR, 1991, 37 (04): : 313 - 315
  • [7] Hybrid Autoregressive and Non-Autoregressive Transformer Models for Speech Recognition
    Tian, Zhengkun
    Yi, Jiangyan
    Tao, Jianhua
    Zhang, Shuai
    Wen, Zhengqi
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 762 - 766
  • [8] Improvement of Degeneracy for Hybrid Transducer
    Ting, Yung
    Tan, Le Ba
    Hariyanto, Gunawan
    Hou, Bing-Kuan
    Van Thang, Lang
    Chen, Cheng-Yu
    Son, Tran Thai
    PROCEEDINGS OF THE ASME INTERNATIONAL DESIGN ENGINEERING TECHNICAL CONFERENCES AND COMPUTERS AND INFORMATION IN ENGINEERING CONFERENCE, VOL 1, PTS A AND B, 2010, : 773 - 780
  • [9] DELIBERATION OF STREAMING RNN-TRANSDUCER BY NON-AUTOREGRESSIVE DECODING
    Wang, Weiran
    Hu, Ke
    Sainath, Tara N.
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7452 - 7456
  • [10] HYBRID MICROELECTRONICS - OLD HAT FOR NEW TIMES
    ELSHABINIRIAD, A
    STEPHENSON, FW
    IEEE CIRCUITS AND DEVICES MAGAZINE, 1990, 6 (02): : 42 - 47