Online Incremental Learning for Speaker-Adaptive Language Models

Authors
Hu, Chih Chi [1]
Liu, Bing [1]
Shen, John Paul [1]
Lane, Ian [1]
Affiliations
[1] Carnegie Mellon Univ, Elect & Comp Engn, Pittsburgh, PA 15213 USA
Keywords
Automatic Speech Recognition; Online Learning; Language Modeling; Speaker-Adaptation; Speaker Specific Modeling; Recurrent Neural Networks; Adaptation
DOI
10.21437/Interspeech.2018-2259
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Voice control is a prominent interaction method on personal computing devices. While automatic speech recognition (ASR) systems are readily applicable to large audiences, there is room for further adaptation at the edge, i.e., locally on devices, targeted at individual users. In this work, we explore improving ASR systems over time through a user's own interactions. Our online learning approach for speaker-adaptive language modeling leverages a user's most recent utterances to enhance the speaker-dependent features and traits. We experiment with the Large Vocabulary Continuous Speech Recognition corpus Tedlium v2, and demonstrate an average reduction in perplexity (PPL) of 19.18% and an average relative reduction in word error rate (WER) of 2.80% compared to a state-of-the-art baseline.
Pages: 3363-3367
Number of pages: 5