Lattice generation in attention-based speech recognition models

Cited: 6
Authors
Zapotoczny, Michal [1]
Pietrzak, Piotr [1]
Lancucki, Adrian [1]
Chorowski, Jan [1]
Institution
[1] Univ Wroclaw, Wroclaw, Poland
Source
INTERSPEECH 2019, 2019
Keywords
speech recognition; beam search; artificial neural networks; attention-based models; lattice generation; decoding;
DOI
10.21437/Interspeech.2019-2667
Chinese Library Classification (CLC)
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification codes
100104; 100213
Abstract
Attention-based neural speech recognition models are frequently decoded with beam search, which produces a tree of hypotheses. In many cases, such as when using external language models, numerous decoding hypotheses need to be considered, requiring large beam sizes during decoding. We demonstrate that it is possible to merge certain nodes in a tree of hypotheses in order to obtain a decoding lattice, which increases the number of decoding hypotheses without increasing the number of candidates that are scored by the neural network. We propose a convolutional architecture which facilitates comparing states of the model at different positions. The experiments are carried out on the Wall Street Journal dataset, where the lattice decoder obtains lower word error rates with smaller beam sizes than an otherwise similar architecture with regular beam search.
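The abstract's core idea is to merge nodes of the beam-search hypothesis tree so that the resulting lattice represents more label sequences without scoring more candidates with the neural network. Below is a minimal Python sketch of that idea, assuming hypotheses are merged whenever their recent decoder states are judged equivalent; the equivalence test, the score combination, and names such as states_equivalent and lattice_beam_step are illustrative assumptions, not the authors' implementation.

    # Sketch of lattice-style beam search: extensions whose decoder states are
    # deemed equivalent are merged into a single lattice node, so only one of
    # them is expanded (and scored by the model) at the next step.
    import numpy as np

    def states_equivalent(state_a, state_b, threshold=1e-3):
        """Assumed merge criterion: decoder states that are numerically close."""
        return np.linalg.norm(state_a - state_b) < threshold

    def lattice_beam_step(beam, expand_fn, beam_size):
        """One decoding step.

        beam: list of dicts with keys 'tokens', 'state', 'logp'.
        expand_fn: maps a hypothesis to its scored one-token extensions.
        Returns at most beam_size hypotheses after merging equivalent ones.
        """
        candidates = []
        for hyp in beam:
            candidates.extend(expand_fn(hyp))  # score extensions with the model

        candidates.sort(key=lambda h: h['logp'], reverse=True)

        merged = []
        for cand in candidates:
            target = next((m for m in merged
                           if states_equivalent(m['state'], cand['state'])), None)
            if target is None:
                merged.append(cand)  # new lattice node
            else:
                # Merge: the node now stands for both label sequences, but only
                # one decoder state is kept and expanded further.
                target['logp'] = np.logaddexp(target['logp'], cand['logp'])
                target.setdefault('merged_tokens', []).append(cand['tokens'])
            if len(merged) >= beam_size:
                break
        return merged

Under this sketch, merged hypotheses share all subsequent expansions, so the number of neural-network forward passes per step stays bounded by the beam size while the lattice encodes many more label sequences, which is the effect the abstract attributes to lattice decoding.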
Pages: 2225 - 2229
Number of pages: 5
Related papers
50 items in total
  • [21] ATTENTION-BASED END-TO-END SPEECH RECOGNITION ON VOICE SEARCH
    Shan, Changhao
    Zhang, Junbo
    Wang, Yujun
    Xie, Lei
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4764 - 4768
  • [22] EXPLICIT ALIGNMENT OF TEXT AND SPEECH ENCODINGS FOR ATTENTION-BASED END-TO-END SPEECH RECOGNITION
    Drexler, Jennifer
    Glass, James
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 913 - 919
  • [23] Thank you for attention: A survey on attention-based artificial neural networks for automatic speech recognition
    Karmakar, Priyabrata
    Teng, Shyh Wei
    Lu, Guojun
    INTELLIGENT SYSTEMS WITH APPLICATIONS, 2024, 23
  • [24] Upgraded Attention-Based Local Feature Learning Block for Speech Emotion Recognition
    Zhao, Huan
    Gao, Yingxue
    Xiao, Yufeng
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT II, 2021, 12713 : 118 - 130
  • [25] A novel dual attention-based BLSTM with hybrid features in speech emotion recognition
    Chen, Qiupu
    Huang, Guimin
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2021, 102
  • [26] Attention-Based End-to-End Named Entity Recognition from Speech
    Porjazovski, Dejan
    Leinonen, Juho
    Kurimo, Mikko
    TEXT, SPEECH, AND DIALOGUE, TSD 2021, 2021, 12848 : 469 - 480
  • [27] Attention-based LSTM with Multi-task Learning for Distant Speech Recognition
    Zhang, Yu
    Zhang, Pengyuan
    Yan, Yonghong
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3857 - 3861
  • [28] CHARACTER-AWARE ATTENTION-BASED END-TO-END SPEECH RECOGNITION
    Meng, Zhong
    Gaur, Yashesh
    Li, Jinyu
    Gong, Yifan
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 949 - 955
  • [30] Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
    Sterpu, George
    Saam, Christian
    Harte, Naomi
    ICMI'18: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2018, : 111 - 115