Improved Language Models for ASR using Written Language Text

Cited by: 1
Authors
Mukherji, Kaustuv [1 ]
Pandharipande, Meghna [1 ]
Kopparapu, Sunil Kumar [1 ]
Affiliation
[1] Tata Consultancy Services Ltd, TCS Research, Mumbai, Maharashtra, India
Keywords
Language Model; Speech Recognition; SPEECH;
DOI
10.1109/NCC55593.2022.9806803
Chinese Library Classification number
TN [Electronic technology, communication technology]
Subject classification code
0809
Abstract
The performance of an Automatic Speech Recognition (ASR) engine depends primarily on (a) the acoustic model (AM), (b) the language model (LM), and (c) the lexicon (Lx). While the contribution of each block to the overall performance of an ASR engine cannot be measured separately, a good LM improves the performance of a domain-specific ASR at a smaller cost. In general, building an LM is greener than building an AM, and is much easier for a domain-specific ASR because it requires only domain-specific text corpora. Traditionally, because of their ready availability, written language text (WLT) corpora have been used to build LMs, although it is agreed that there is a significant difference between WLT and spoken language text (SLT). In this paper, we explore methods and techniques that can be used to convert WLT into a form that yields a better LM to support ASR performance.
Pages: 362-366
Number of pages: 5