Large Vocabulary SOUL Neural Network Language Models

Cited by: 0
Authors
Le, Hai-Son [1 ,2 ]
Oparin, Ilya [2 ]
Messaoudi, Abdel [2 ]
Allauzen, Alexandre [1 ,2 ]
Gauvain, Jean-Luc [2 ]
Yvon, Francois [1 ,2 ]
Affiliations
[1] Univ Paris 11, BP 133, F-91403 Orsay, France
[2] Spoken Language Proc Grp, CNRS, LIMSI, F-91403 Orsay, France
Keywords
Neural Network Language Model; Automatic Speech Recognition; Speech-To-Text
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This paper presents a continuation of research on Structured OUtput Layer Neural Network language models (SOUL NNLM) for automatic speech recognition. As SOUL NNLMs allow estimating probabilities for all in-vocabulary words, and not only for those pertaining to a limited shortlist, we investigate their performance on a large-vocabulary task. Significant improvements in both perplexity and word error rate over conventional shortlist-based NNLMs are shown on a challenging Arabic GALE task characterized by a recognition vocabulary of about 300k entries. A new training scheme is proposed for SOUL NNLMs that is based on separate training of the out-of-shortlist part of the output layer. It enables using more data at each iteration of neural network training without any considerable slow-down, and brings additional improvements in speech recognition performance.
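The core idea behind a structured output layer is to factor the word probability through word classes, so that every in-vocabulary word receives a probability without a full softmax over the vocabulary. Below is a minimal, hypothetical sketch of a single-level class factorization, P(w | h) = P(c(w) | h) · P(w | c(w), h); the actual SOUL model described in the abstract uses a deeper class hierarchy combined with a shortlist, and all dimensions and weight names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy dimensions (assumptions): hidden size 8, 3 word classes, 4 words per class.
H, C, W = 8, 3, 4
class_weights = rng.normal(size=(C, H))    # class-level output layer
word_weights = rng.normal(size=(C, W, H))  # one word-level layer per class

def word_probability(h, c, w):
    """P(word w | history h) = P(class c | h) * P(word w | class c, h)."""
    p_class = softmax(class_weights @ h)
    p_word = softmax(word_weights[c] @ h)
    return p_class[c] * p_word[w]

h = rng.normal(size=H)  # hidden representation of the word history
total = sum(word_probability(h, c, w) for c in range(C) for w in range(W))
# The factored probabilities still sum to 1 over the full vocabulary.
assert abs(total - 1.0) < 1e-9
```

Because each prediction only evaluates the class softmax plus one within-class softmax, the cost scales with the class sizes rather than the full 300k-entry vocabulary, which is what makes the large-vocabulary setting tractable.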
Pages: 1480+
Page count: 2