Improved Language Models for ASR using Written Language Text

Cited by: 1
Authors
Mukherji, Kaustuv [1 ]
Pandharipande, Meghna [1 ]
Kopparapu, Sunil Kumar [1 ]
Affiliations
[1] Tata Consultancy Serv Ltd, TCS Res, Mumbai, Maharashtra, India
Keywords
Language Model; Speech Recognition; SPEECH;
DOI
10.1109/NCC55593.2022.9806803
Chinese Library Classification
TN [Electronic Technology, Communication Technology];
Discipline classification code
0809 ;
Abstract
The performance of an Automatic Speech Recognition (ASR) engine depends primarily on (a) the acoustic model (AM), (b) the language model (LM) and (c) the lexicon (Lx). While the contribution of each block to the overall performance of an ASR engine cannot be measured separately, a good LM improves the performance of a domain-specific ASR at a smaller cost. Generally, building an LM is greener than building an AM, and is much easier for a domain-specific ASR because it requires only domain-specific text corpora. Traditionally, because of their ready availability, written language text (WLT) corpora have been used to build LMs, though there is agreement that there is a significant difference between WLT and spoken language text (SLT). In this paper, we explore methods and techniques that can be used to convert WLT into a form that realizes a better LM to support ASR performance.
Pages: 362-366
Page count: 5