Improved Language Models for ASR using Written Language Text

Cited by: 1
Authors
Mukherji, Kaustuv [1]
Pandharipande, Meghna [1]
Kopparapu, Sunil Kumar [1]
Affiliations
[1] Tata Consultancy Serv Ltd, TCS Res, Mumbai, Maharashtra, India
Keywords
Language Model; Speech Recognition; SPEECH;
DOI
10.1109/NCC55593.2022.9806803
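Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology];
Discipline classification code
0809;
Abstract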
The performance of an Automatic Speech Recognition (ASR) engine depends primarily on (a) the acoustic model (AM), (b) the language model (LM), and (c) the lexicon (Lx). While the contribution of each block to the overall performance of an ASR engine cannot be measured separately, a good LM improves the performance of a domain-specific ASR at a smaller cost. Generally, building an LM is greener than building an AM and is much easier for a domain-specific ASR, because it requires only domain-specific text corpora. Traditionally, because of their ready availability, written language text (WLT) corpora have been used to build LMs, though there is agreement that there is a significant difference between WLT and spoken language text (SLT). In this paper, we explore methods and techniques that can be used to convert WLT into a form that realizes a better LM to support ASR performance.
Pages: 362-366
Number of pages: 5
Related papers
50 in total
  • [1] Improved Hybrid Streaming ASR with Transformer Language Models
    Baquero-Arnal, Pau
    Jorge, Javier
    Gimenez, Adria
    Albert Silvestre-Cerda, Joan
    Iranzo-Sanchez, Javier
    Sanchis, Albert
    Civera, Jorge
    Juan, Alfons
    INTERSPEECH 2020, 2020, : 2127 - 2131
  • [2] Using ASR-Generated Text for Spoken Language Modeling
    Herve, Nicolas
    Pelloin, Valentin
    Favre, Benoit
    Dary, Franck
    Laurent, Antoine
    Meignier, Sylvain
    Besacier, Laurent
    PROCEEDINGS OF WORKSHOP ON CHALLENGES & PERSPECTIVES IN CREATING LARGE LANGUAGE MODELS (BIGSCIENCE EPISODE #5), 2022, : 17 - 25
  • [3] Image and Text Correction Using Language Models
    Kissos, Ido
    Dershowitz, Nachum
    2017 1ST INTERNATIONAL WORKSHOP ON ARABIC SCRIPT ANALYSIS AND RECOGNITION (ASAR), 2017, : 158 - 162
  • [4] Spoken ≈ Written: communication, language, text.
    Bogoczova, Irena
    SLOVO A SLOVESNOST, 2011, 72 (02): : 140 - 145
  • [5] IMPROVING LANGUAGE MODELS FOR ASR USING TRANSLATED IN-DOMAIN DATA
    Kombrink, Stefan
    Mikolov, Tomas
    Karafiat, Martin
    Burget, Lukas
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4405 - 4408
  • [6] TOWARDS AN ASR APPROACH USING ACOUSTIC AND LANGUAGE MODELS FOR SPEECH ENHANCEMENT
    Nayem, Khandokar Md
    Williamson, Donald S.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7123 - 7127
  • [7] Improved phonotactic language identification using random forest language models
    Wang, XiaoRui
    Wang, ShiJin
    Liang, JiaEn
    Xu, Bo
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4237 - 4240
  • [8] Legal Text Analysis Using Large Language Models
    Arfat, Yasir
    Colella, Marco
    Marello, Enrico
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT II, NLDB 2024, 2024, 14763 : 258 - 268
  • [9] Portuguese text generation using factored language models
    Paraboni, I. (ivandre@usp.br), 1600, Springer London (19):
  • [10] USING THE WRITTEN LANGUAGE CREATIVELY
    ALLEN, VG
    MODERN LANGUAGE JOURNAL, 1978, 62 (08): : 407 - 410