Text Augmentation for Language Models in High Error Recognition Scenario

Cited by: 2
Authors
Benes, Karel [1 ]
Burget, Lukas [1 ]
Affiliations
[1] Brno Univ Technol, Brno, Czech Republic
Source
Funding
National Science Foundation (USA);
Keywords
data augmentation; error simulation; language modeling; automatic speech recognition;
DOI
10.21437/Interspeech.2021-627
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104 ; 100213 ;
Abstract
In this paper, we explore several data augmentation strategies for training language models for speech recognition. We compare augmentation based on global error statistics with one based on unigram statistics of ASR errors, and with label smoothing and its sampled variant. Additionally, we investigate the stability and the predictive power of perplexity estimated on augmented data. Despite being trivial, augmentation driven by global substitution, deletion, and insertion rates achieves the best rescoring results. On the other hand, even though the associated perplexity measure is stable, it predicts the final error rate no better than the vanilla one. Our best augmentation scheme increases the WER improvement from second-pass rescoring from 1.1% to 1.9% absolute on the CHiME-6 challenge.
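The abstract describes the best-performing scheme only as augmentation driven by global substitution, deletion, and insertion rates. A minimal Python sketch of that general idea follows. It is an illustration under assumptions, not the authors' implementation: the function name `augment`, the default rates, and the choice to draw substituted and inserted words uniformly from the vocabulary are all hypothetical.

```python
import random


def augment(tokens, vocab, p_sub=0.1, p_del=0.05, p_ins=0.05, rng=None):
    """Corrupt a token sequence using fixed, global error rates.

    Hypothetical sketch: each input token is independently deleted,
    substituted, or kept, and a random word may be inserted after it.
    """
    if rng is None:
        rng = random.Random()
    out = []
    for tok in tokens:
        r = rng.random()
        if r < p_del:
            pass  # deletion: the token is dropped
        elif r < p_del + p_sub:
            # substitution: uniform draw from the vocabulary (an assumption;
            # the draw may occasionally return the original word)
            out.append(rng.choice(vocab))
        else:
            out.append(tok)  # token kept unchanged
        if rng.random() < p_ins:
            out.append(rng.choice(vocab))  # insertion after this position
    return out


if __name__ == "__main__":
    vocab = ["the", "a", "cat", "dog", "sat", "ran"]
    print(augment("the cat sat".split(), vocab, rng=random.Random(0)))
```

Applying such a function to the language model's training text on the fly would expose the model to inputs that resemble noisy ASR hypotheses; how the rates are estimated and how targets are handled are details the abstract does not state.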
Pages: 1872-1876
Page count: 5
Related papers
50 in total
  • [1] Evaluation and Analysis of Large Language Models for Clinical Text Augmentation and Generation
    Latif, Atif
    Kim, Jihie
    IEEE ACCESS, 2024, 12 : 48987 - 48996
  • [2] Text-to-SQL Error Correction with Language Models of Code
    Chen, Ziru
    Chen, Shijie
    White, Michael
    Mooney, Raymond
    Payani, Ali
    Srinivasa, Jayanth
    Su, Yu
    Sun, Huan
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1359 - 1372
  • [3] Text Data Augmentation for the Korean Language
    Dang Thanh Vu
    Yu, Gwanghyun
    Lee, Chilwoo
    Kim, Jinyoung
    APPLIED SCIENCES-BASEL, 2022, 12 (07):
  • [4] Data Augmentation for Scene Text Recognition
    Atienza, Rowel
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 1561 - 1570
  • [5] Neural Error Corrective Language Models for Automatic Speech Recognition
    Tanaka, Tomohiro
    Masumura, Ryo
    Masataki, Hirokazu
    Aono, Yushi
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 401 - 405
  • [6] On the influence of vocabulary size and language models in unconstrained handwritten text recognition
    Marti, UV
    Bunke, H
    SIXTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, PROCEEDINGS, 2001, : 260 - 265
  • [7] Evaluation of Neural Network Language Models In Handwritten Chinese Text Recognition
    Wu, Yi-Chao
    Yin, Fei
    Liu, Cheng-Lin
    2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 166 - 170
  • [8] N-gram language models for offline handwritten text recognition
    Zimmermann, M
    Bunke, H
    NINTH INTERNATIONAL WORKSHOP ON FRONTIERS IN HANDWRITING RECOGNITION, PROCEEDINGS, 2004, : 203 - 208
  • [9] Adapting Code-Switching Language Models with Statistical-Based Text Augmentation
    Prachaseree, Chaiyasait
    Gupta, Kshitij
    Thi Nga Ho
    Peng, Yizhou
    Tun, Kyaw Zin
    Chng, Eng Siong
    Chalapthi, G. S. S.
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II, 2023, 13996 : 310 - 322
  • [10] GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
    Yoo, Kang Min
    Park, Dongju
    Kang, Jaewook
    Lee, Sang-Woo
    Park, Woomyeong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 2225 - 2239