Dealing with Unknowns in Continual Learning for End-to-end Automatic Speech Recognition

Cited by: 3
Authors
Sustek, Martin [1, 2]
Sadhu, Samik [2]
Hermansky, Hynek [1, 2, 3]
Affiliations
[1] Brno Univ Technol, Brno, Czech Republic
[2] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[3] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Source
Keywords
speech recognition; continual learning; multi stream speech recognition; ENVIRONMENT;
DOI
10.21437/Interspeech.2022-11139
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Learning continually from data is a task that humans perform effortlessly, but it remains a significant challenge for machines. Moreover, machines fail to generalize when they encounter unknown test scenarios. We propose a mathematically motivated, dynamically expanding end-to-end model of independent sequence-to-sequence components trained on different data sets that avoids catastrophic forgetting of knowledge acquired from previously seen data while seamlessly integrating knowledge from new data. During inference, the likelihoods of the unknown test scenario are computed using internal model activation distributions. The inference made by each independent component is weighted by the normalized likelihood values to obtain the final decision.
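The weighting step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each component's internal activation distribution is summarized by a diagonal Gaussian and that each component emits a probability vector over output classes; the record does not state the density model or output format the authors use, and the function names `diag_gaussian_loglik`, `normalized_weights`, and `combine` are hypothetical.

```python
import numpy as np

def diag_gaussian_loglik(x, mean, var):
    """Log-density of activation vector x under a diagonal Gaussian.
    Assumption: a diagonal Gaussian stands in for the activation distribution."""
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var))

def normalized_weights(logliks):
    """Normalize likelihoods so they sum to 1, computed stably in the log domain."""
    logliks = np.asarray(logliks, dtype=float)
    w = np.exp(logliks - logliks.max())  # subtract the max to avoid underflow
    return w / w.sum()

def combine(posteriors, weights):
    """Likelihood-weighted sum of the per-component output distributions."""
    posteriors = np.asarray(posteriors, dtype=float)  # shape: (components, classes)
    return np.asarray(weights, dtype=float) @ posteriors

# Toy example with two components (all numbers are made up).
# Each component stores the mean and variance of its training-time activations.
stats = [(np.zeros(2), np.ones(2)), (np.zeros(2), np.ones(2))]
# Activations produced by each component on one test utterance.
activations = [np.array([0.1, -0.2]), np.array([3.0, 2.5])]
logliks = [diag_gaussian_loglik(a, m, v) for a, (m, v) in zip(activations, stats)]
weights = normalized_weights(logliks)

# Per-component output distributions over two classes.
posteriors = [[0.9, 0.1], [0.2, 0.8]]
final = combine(posteriors, weights)
decision = int(np.argmax(final))
```

In this toy case the first component's activations look typical of its training data, so it receives almost all of the weight and dominates the final decision. Adding a component for a new data set only appends one entry to `stats` and `posteriors`; the existing components are left unchanged, which is the property the abstract associates with avoiding catastrophic forgetting.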
Pages: 1046-1050
Page count: 5
Related papers
50 in total
  • [41] End-to-End Deep Learning Speech Recognition Model for Silent Speech Challenge
    Kimura, Naoki
    Su, Zixiong
    Saeki, Takaaki
    INTERSPEECH 2020, 2020, : 1025 - 1026
  • [42] An End-to-End Dialect Identification System with Transfer Learning from a Multilingual Automatic Speech Recognition Model
    Wang, Ding
    Ye, Shuaishuai
    Hu, Xinhui
    Li, Sheng
    Xu, Xinkang
    INTERSPEECH 2021, 2021, : 3266 - 3270
  • [43] End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
    Tanaka, Tomohiro
    Masumura, Ryo
    Ihori, Mana
    Takashima, Akihiko
    Orihashi, Shota
    Makishima, Naoki
    INTERSPEECH 2021, 2021, : 4458 - 4462
  • [44] IMPROVING UNSUPERVISED STYLE TRANSFER IN END-TO-END SPEECH SYNTHESIS WITH END-TO-END SPEECH RECOGNITION
    Liu, Da-Rong
    Yang, Chi-Yu
    Wu, Szu-Lin
    Lee, Hung-Yi
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 640 - 647
  • [45] SFA: Searching faster architectures for end-to-end automatic speech recognition models
    Liu, Yukun
    Li, Ta
    Zhang, Pengyuan
    Yan, Yonghong
    COMPUTER SPEECH AND LANGUAGE, 2023, 81
  • [46] AN END-TO-END APPROACH TO JOINT SOCIAL SIGNAL DETECTION AND AUTOMATIC SPEECH RECOGNITION
    Inaguma, Hirofumi
    Mimura, Masato
    Inoue, Koji
    Yoshii, Kazuyoshi
    Kawahara, Tatsuya
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6214 - 6218
  • [47] Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation
    Kim, Hanbyul
    Seo, Seunghyun
    Lee, Lukas
    Baek, Seolki
    INTERSPEECH 2023, 2023, : 1653 - 1657
  • [48] Hardware Accelerator for Transformer based End-to-End Automatic Speech Recognition System
    Yamini, Shaarada D.
    Mirishkar, Ganesh S.
    Vuppala, Anil Kumar
    Purini, Suresh
    2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW, 2023, : 93 - 100
  • [49] Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture
    Miao, Haoran
    Cheng, Gaofeng
    Zhang, Pengyuan
    Yan, Yonghong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1452 - 1465
  • [50] AUDITORY-BASED DATA AUGMENTATION FOR END-TO-END AUTOMATIC SPEECH RECOGNITION
    Tu, Zehai
    Deadman, Jack
    Ma, Ning
    Barker, Jon
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7447 - 7451