SEMI-SUPERVISED TRAINING FOR END-TO-END MODELS VIA WEAK DISTILLATION

Cited by: 0
Authors
Li, Bo [1]
Sainath, Tara N. [1]
Pang, Ruoming [1]
Wu, Zelin [1]
Affiliations
[1] Google LLC, Mountain View, CA 94043 USA
Keywords
semi-supervised training; sequence to sequence
DOI
10.1109/icassp.2019.8682172
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
End-to-end (E2E) models are a promising research direction in speech recognition, as the single all-neural E2E system offers a much simpler and more compact solution than a conventional model, which has separate acoustic (AM), pronunciation (PM), and language (LM) models. However, it has been noted that E2E models perform poorly on tail words and proper nouns, likely because end-to-end optimization requires joint audio-text pairs and does not take advantage of the additional lexicons and large amounts of text-only data used to train the LMs in conventional models. There have been numerous efforts to train an RNN-LM on text-only data and fuse it into the end-to-end model. In this work, we contrast this approach with training the E2E model on audio-text pairs generated from unsupervised speech data. To target the proper noun issue specifically, we adopt a Part-of-Speech (POS) tagger to filter the unsupervised data, keeping only utterances that contain proper nouns. We show that training with filtered unsupervised data provides up to a 13% relative reduction in word error rate (WER), and, when used in conjunction with a cold-fusion RNN-LM, up to a 17% relative improvement.
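The proper-noun filtering step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the machine-generated transcripts have already been POS-tagged, and that the tagger emits Penn Treebank tags (`NNP`, `NNPS`), which the abstract does not specify. The names `PROPER_NOUN_TAGS`, `has_proper_noun`, and `filter_utterances` are hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumption: Penn Treebank tags for singular and plural proper nouns.
# The abstract names neither the tagger nor its tag set.
PROPER_NOUN_TAGS = {"NNP", "NNPS"}

# A tagged transcript is a list of (word, POS tag) pairs.
TaggedTranscript = List[Tuple[str, str]]


def has_proper_noun(tagged: TaggedTranscript) -> bool:
    """Return True if any token carries a proper-noun tag."""
    return any(tag in PROPER_NOUN_TAGS for _, tag in tagged)


def filter_utterances(
    utterances: Iterable[Tuple[str, TaggedTranscript]],
) -> List[Tuple[str, str]]:
    """Keep (audio_id, transcript) pairs whose transcript has a proper noun."""
    kept = []
    for audio_id, tagged in utterances:
        if has_proper_noun(tagged):
            transcript = " ".join(word for word, _ in tagged)
            kept.append((audio_id, transcript))
    return kept


if __name__ == "__main__":
    # Hypothetical, hand-tagged examples for illustration only.
    data = [
        ("utt1", [("play", "VB"), ("Taylor", "NNP"), ("Swift", "NNP")]),
        ("utt2", [("set", "VB"), ("a", "DT"), ("timer", "NN")]),
    ]
    print(filter_utterances(data))
```

Under these assumptions only `utt1` survives the filter, and the surviving audio-text pairs would then be added to the training data for the E2E model.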
Pages: 2837-2841
Number of pages: 5
Related papers
50 in total
  • [31] SEMI-SUPERVISED END-TO-END SPEECH RECOGNITION USING TEXT-TO-SPEECH AND AUTOENCODERS
    Karita, Shigeki
    Watanabe, Shinji
    Iwata, Tomoharu
    Delcroix, Marc
    Ogawa, Atsunori
    Nakatani, Tomohiro
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6166 - 6170
  • [32] SEMI-SUPERVISED TRANSFER LEARNING FOR LANGUAGE EXPANSION OF END-TO-END SPEECH RECOGNITION MODELS TO LOW-RESOURCE LANGUAGES
    Kim, Jiyeon
    Kumar, Mehul
    Gowda, Dhananjaya
    Garg, Abhinav
    Kim, Chanwoo
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 984 - 988
  • [33] Real-Time End-to-End Vehicle and Landmark Localization Based on Semi-Supervised Learning
    Xiao, Nengfei
    Xiong, Zhongxia
    Ma, Yalong
    Wu, Xinkai
    CICTP 2023: INNOVATION-EMPOWERED TECHNOLOGY FOR SUSTAINABLE, INTELLIGENT, DECARBONIZED, AND CONNECTED TRANSPORTATION, 2023, : 268 - 278
  • [34] EMOVA: A Semi-supervised End-to-End Moving-Window Attentive Framework for Aspect Mining
    Li, Ning
    Chow, Chi-Yin
    Zhang, Jia-Dong
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2020, PT II, 2020, 12085 : 811 - 823
  • [35] End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
    Tanaka, Tomohiro
    Masumura, Ryo
    Ihori, Mana
    Takashima, Akihiko
    Orihashi, Shota
    Makishima, Naoki
    INTERSPEECH 2021, 2021, : 4458 - 4462
  • [36] Tic action recognition for children tic disorder with end-to-end video semi-supervised learning
    Wang, Xiangyang
    Yang, Kun
    Ding, Qiang
    Wang, Rui
    Sun, Jinhua
    VISUAL COMPUTER, 2025,
  • [37] End-To-End Graph-Based Deep Semi-Supervised Learning with Extended Graph Laplacian
    Wang, Zihao
    Tu, Enmei
    Zhou, Meng
    Yang, Jie
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 5948 - 5953
  • [38] Dialect-aware Semi-supervised Learning for End-to-End Multi-dialect Speech Recognition
    Shiota, Sayaka
    Imaizumi, Ryo
    Masumura, Ryo
    Kiya, Hitoshi
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 240 - 244
  • [39] Boundary-refined prototype generation: A general end-to-end paradigm for semi-supervised semantic segmentation
    Dong, Junhao
    Meng, Zhu
    Liu, Delong
    Liu, Jiaxuan
    Zhao, Zhicheng
    Su, Fei
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 137
  • [40] Seq3seq Fingerprint: Towards End-to-end Semi-supervised Deep Drug Discovery
    Zhang, Xiaoyu
    Wang, Sheng
    Zhu, Feiyun
    Xu, Zheng
    Wang, Yuhong
    Huang, Junzhou
    ACM-BCB'18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS, 2018, : 404 - 413