Dealing with Unknowns in Continual Learning for End-to-end Automatic Speech Recognition

Cited by: 3

Authors:

Sustek, Martin [1,2]

Sadhu, Samik [2]

Hermansky, Hynek [1,2,3]

Affiliations:

[1] Brno Univ Technol, Brno, Czech Republic

[2] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[3] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

INTERSPEECH 2022 | 2022

Keywords:

speech recognition; continual learning; multi-stream speech recognition; ENVIRONMENT

DOI:

10.21437/Interspeech.2022-11139

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Learning continually from data is a task that humans perform effortlessly but that remains a significant challenge for machines. Moreover, machines fail to generalize when they encounter unknown test scenarios. We propose a mathematically motivated, dynamically expanding end-to-end model composed of independent sequence-to-sequence components trained on different data sets. The model avoids catastrophic forgetting of knowledge acquired from previously seen data while seamlessly integrating knowledge from new data. During inference, the likelihoods of the unknown test scenario are computed using internal model activation distributions. The inference made by each independent component is weighted by the normalized likelihood values to obtain the final decision.
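The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each component has already produced a log-likelihood for the test input and a posterior distribution over output tokens, and the function names `normalize_log_likelihoods` and `combine_posteriors` are hypothetical. How the likelihoods are derived from internal activation distributions is outside the scope of this sketch.

```python
import math


def normalize_log_likelihoods(log_likelihoods):
    """Convert per-component log-likelihoods into weights that sum to 1.

    Assumption: one scalar log-likelihood per component. Subtracting the
    maximum before exponentiating keeps the computation numerically stable.
    """
    peak = max(log_likelihoods)
    exps = [math.exp(value - peak) for value in log_likelihoods]
    total = sum(exps)
    return [e / total for e in exps]


def combine_posteriors(posteriors, log_likelihoods):
    """Weight each component's posterior by its normalized likelihood.

    posteriors: one probability vector per component, all of equal length.
    Returns the weighted average as a single probability vector.
    """
    weights = normalize_log_likelihoods(log_likelihoods)
    size = len(posteriors[0])
    return [
        sum(w * p[k] for w, p in zip(weights, posteriors))
        for k in range(size)
    ]


# Two hypothetical components; the second one explains the input far better.
combined = combine_posteriors(
    posteriors=[[0.9, 0.1], [0.2, 0.8]],
    log_likelihoods=[-12.0, -3.0],
)
```

In this sketch the component with the higher likelihood dominates the combined decision, while components trained on poorly matching data contribute little.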

Pages: 1046-1050

Number of pages: 5
