Lattice generation in attention-based speech recognition models

Cited by: 6
Authors
Zapotoczny, Michal [1]
Pietrzak, Piotr [1]
Lancucki, Adrian [1]
Chorowski, Jan [1]
Affiliations
[1] Univ Wroclaw, Wroclaw, Poland
Source
INTERSPEECH 2019 | 2019
Keywords
speech recognition; beam search; artificial neural networks; attention-based models; lattice generation; decoding;
DOI
10.21437/Interspeech.2019-2667
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
Attention-based neural speech recognition models are frequently decoded with beam search, which produces a tree of hypotheses. In many cases, such as when using external language models, numerous decoding hypotheses need to be considered, requiring large beam sizes during decoding. We demonstrate that it is possible to merge certain nodes in a tree of hypotheses in order to obtain a decoding lattice, which increases the number of decoding hypotheses without increasing the number of candidates that are scored by the neural network. We propose a convolutional architecture, which facilitates comparing states of the model at different pi[…] The experiments are carried out on the Wall Street Journal dataset, where the lattice decoder obtains lower word error rates with smaller beam sizes than an otherwise similar architecture with regular beam search.
Pages: 2225-2229
Number of pages: 5