Dealing with Unknowns in Continual Learning for End-to-end Automatic Speech Recognition

Cited by: 3

Authors:

Sustek, Martin [1,2]

Sadhu, Samik [2]

Hermansky, Hynek [1,2,3]

Affiliations:

[1] Brno Univ Technol, Brno, Czech Republic

[2] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[3] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

INTERSPEECH 2022 | 2022

Keywords:

speech recognition; continual learning; multi-stream speech recognition; environment

DOI:

10.21437/Interspeech.2022-11139

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Learning continually from data is a task that humans perform effortlessly but that remains a significant challenge for machines. Moreover, machines fail to generalize when they encounter unknown test scenarios. We propose a mathematically motivated, dynamically expanding end-to-end model composed of independent sequence-to-sequence components trained on different data sets. The model avoids catastrophic forgetting of knowledge acquired from previously seen data while seamlessly integrating knowledge from new data. During inference, the likelihoods of the unknown test scenario are computed using internal model activation distributions. The inference made by each independent component is weighted by the normalized likelihood values to obtain the final decision.
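The weighting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each component has already produced a log-likelihood for the test utterance and an output probability distribution over a shared vocabulary. The function names `normalize_likelihoods` and `combine_components` and all numbers are hypothetical.

```python
import math


def normalize_likelihoods(log_likelihoods):
    """Turn per-component log-likelihoods into weights that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    max_ll = max(log_likelihoods)
    exps = [math.exp(ll - max_ll) for ll in log_likelihoods]
    total = sum(exps)
    return [e / total for e in exps]


def combine_components(log_likelihoods, component_posteriors):
    """Weighted average of the per-component output distributions.

    Assumption: every posterior is a probability vector over the same
    vocabulary, so a weighted sum is again a probability vector.
    """
    weights = normalize_likelihoods(log_likelihoods)
    vocab_size = len(component_posteriors[0])
    combined = [0.0] * vocab_size
    for weight, posterior in zip(weights, component_posteriors):
        for i, prob in enumerate(posterior):
            combined[i] += weight * prob
    return weights, combined


# Toy example with made-up numbers: two components, three output symbols.
log_likelihoods = [-10.0, -12.0]   # component 0 fits the test data better
posteriors = [
    [0.7, 0.2, 0.1],               # output distribution of component 0
    [0.1, 0.8, 0.1],               # output distribution of component 1
]
weights, combined = combine_components(log_likelihoods, posteriors)
decision = combined.index(max(combined))
```

In this toy setting the component with the higher likelihood receives the larger weight, so its preferred symbol dominates the combined decision. How the likelihoods are derived from internal activation distributions is not specified in the abstract and is left out of the sketch.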


Pages: 1046-1050

Page count: 5
