Online Incremental Learning for Speaker-Adaptive Language Models

Authors
Hu, Chih Chi [1]
Liu, Bing [1]
Shen, John Paul [1]
Lane, Ian [1]
Affiliations
[1] Carnegie Mellon Univ, Elect & Comp Engn, Pittsburgh, PA 15213 USA
Keywords
Automatic Speech Recognition; Online Learning; Language Modeling; Speaker-Adaptation; Speaker Specific Modeling; Recurrent Neural Networks; Adaptation
DOI
10.21437/Interspeech.2018-2259
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Voice control is a prominent interaction method on personal computing devices. While automatic speech recognition (ASR) systems are readily applicable to large audiences, there is room for further adaptation at the edge, i.e., locally on devices, targeted at individual users. In this work, we explore improving ASR systems over time through a user's own interactions. Our online learning approach for speaker-adaptive language modeling leverages a user's most recent utterances to enhance the speaker-dependent features and traits. We experiment with the Large Vocabulary Continuous Speech Recognition corpus Tedlium v2, and demonstrate an average reduction in perplexity (PPL) of 19.18% and an average relative reduction in word error rate (WER) of 2.80% compared to a state-of-the-art baseline on Tedlium v2.
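As a reading aid, a relative reduction such as the WER figure quoted above is conventionally the drop from the baseline value divided by the baseline value. The sketch below assumes that convention; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_reduction(baseline: float, adapted: float) -> float:
    """Return the percentage drop of `adapted` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - adapted) / baseline * 100.0


# Hypothetical values for illustration only, not results from the paper.
example = relative_reduction(baseline=100.0, adapted=80.0)
print(f"{example:.2f}% relative reduction")
```

The same function applies to either perplexity or word error rate, since both are metrics where lower is better.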
Pages: 3363-3367
Page count: 5