CHARACTER-AWARE ATTENTION-BASED END-TO-END SPEECH RECOGNITION

Cited by: 0

Authors:

Meng, Zhong [1]

Gaur, Yashesh [1]

Li, Jinyu [1]

Gong, Yifan [1]

Affiliations:

[1] Microsoft Corp, Redmond, WA 98052 USA

Source:

2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) | 2019

Keywords:

character-aware; end-to-end; attention; encoder-decoder; speech recognition; neural networks

DOI:

10.1109/asru46091.2019.9004018

Chinese Library Classification:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Predicting words and subword units (WSUs) as the output has been shown to be effective for the attention-based encoder-decoder (AED) model in end-to-end speech recognition. However, as one input to the decoder recurrent neural network (RNN), each WSU embedding is learned independently through context and acoustic information in a purely data-driven fashion. Little effort has been made to explicitly model the morphological relationships among WSUs. In this work, we propose a novel character-aware (CA) AED model in which each WSU embedding is computed by summarizing the embeddings of its constituent characters using a CA-RNN. This WSU-independent CA-RNN is jointly trained with the encoder, the decoder and the attention network of a conventional AED to predict WSUs. With CA-AED, the embeddings of morphologically similar WSUs are naturally and directly correlated through the CA-RNN, in addition to the semantic and acoustic relations modeled by a traditional AED. Moreover, CA-AED significantly reduces the number of model parameters relative to a traditional AED by replacing the large pool of WSU embeddings with a much smaller set of character embeddings. On a 3400-hour Microsoft Cortana dataset, CA-AED achieves up to 11.9% relative WER improvement over a strong AED baseline with 27.1% fewer model parameters.
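The core idea in the abstract, computing each WSU embedding from its characters with a shared RNN instead of looking it up in a per-WSU table, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the CA-RNN architecture, so a plain tanh (Elman) recurrence stands in for it, the final hidden state is used as the summary, the weights are random rather than trained, and all names and dimensions (`wsu_embedding`, `CHAR_DIM`, `HIDDEN_DIM`, the vocabulary size) are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (stand-in for trained parameters)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_step(h, x, w_h, w_x):
    """One step of a plain tanh RNN: h_new = tanh(W_h h + W_x x)."""
    return [
        math.tanh(
            sum(wh * hj for wh, hj in zip(w_h[i], h))
            + sum(wx * xj for wx, xj in zip(w_x[i], x))
        )
        for i in range(len(h))
    ]


def wsu_embedding(wsu, char_table, w_h, w_x):
    """Summarize the character embeddings of one WSU into a single vector.

    Assumption: the final hidden state of the character RNN is the WSU
    embedding. The same weights are shared by every WSU.
    """
    h = [0.0] * len(w_h)
    for ch in wsu:
        h = rnn_step(h, char_table[ch], w_h, w_x)
    return h


def embedding_param_counts(num_wsus, num_chars, char_dim, hidden_dim):
    """Parameters of a per-WSU lookup table vs. the shared character model."""
    table = num_wsus * hidden_dim
    shared = num_chars * char_dim + hidden_dim * (hidden_dim + char_dim)
    return table, shared


# Hypothetical toy dimensions, chosen only to keep the example small.
CHAR_DIM, HIDDEN_DIM = 4, 6
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

rng = random.Random(0)
char_table = {ch: [rng.uniform(-0.1, 0.1) for _ in range(CHAR_DIM)] for ch in ALPHABET}
w_h = make_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)
w_x = make_matrix(HIDDEN_DIM, CHAR_DIM, rng)

# "play" and "plays" share their first four recurrence steps, so their
# embeddings are tied through the same character embeddings and weights.
emb_play = wsu_embedding("play", char_table, w_h, w_x)
emb_plays = wsu_embedding("plays", char_table, w_h, w_x)

table_params, shared_params = embedding_param_counts(1000, len(ALPHABET), CHAR_DIM, HIDDEN_DIM)
print(len(emb_play), table_params, shared_params)
```

In this toy setting the shared character model needs far fewer parameters than a 1000-entry lookup table of the same output size, which is the mechanism behind the parameter reduction the abstract reports. The toy numbers are illustrative only and are not related to the 27.1% figure.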


Pages: 949-955

Page count: 7
