UNNORMALIZED EXPONENTIAL AND NEURAL NETWORK LANGUAGE MODELS

Cited by: 0
Authors
Sethy, Abhinav [1 ]
Chen, Stanley [1 ]
Arisoy, Ebru [1 ]
Ramabhadran, Bhuvana [1 ]
Affiliation
[1] IBM Corp, TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
Model M; unnormalized models; neural network language models; fast lookup;
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
Model M, an exponential class-based language model, and neural network language models (NNLMs) have outperformed word n-gram language models over a wide range of tasks. However, these gains come at the cost of vastly increased computation when calculating word probabilities. For both models, the bulk of this computation involves evaluating the softmax function over a large word or class vocabulary to ensure that probabilities sum to 1. In this paper, we study unnormalized variants of Model M and NNLMs, in which the softmax function is simply omitted; accordingly, model training must be modified to encourage scores to sum close to 1. We demonstrate up to a factor of 35 faster n-gram lookups with unnormalized models over their normalized counterparts, while still yielding state-of-the-art performance (10.2% WER on the English broadcast news rt04 set).
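To make the idea in the abstract concrete, the sketch below contrasts a standard softmax lookup with an unnormalized lookup for a toy NNLM output layer, and shows one possible training penalty that pushes the log normalizer toward zero so that raw scores behave like probabilities at test time. This is a minimal illustration under stated assumptions: the squared-log-Z penalty, the variable names, and the toy sizes are not the paper's exact formulation.

```python
import numpy as np

# Illustrative sketch of an unnormalized (self-normalized) output layer.
# The penalty term and all names here are assumptions for exposition,
# not the training objective used by Sethy et al.

rng = np.random.default_rng(0)
V, H = 1000, 64                      # toy vocabulary and hidden sizes
W = rng.normal(0.0, 0.01, (V, H))    # output weight matrix
b = np.zeros(V)                      # output biases

def normalized_logprob(h, w):
    # Standard softmax lookup: requires scoring the entire vocabulary
    # just to compute the normalizer -- the expensive part at test time.
    s = W @ h + b
    return s[w] - np.log(np.sum(np.exp(s)))

def unnormalized_logprob(h, w):
    # Softmax omitted: only one row of W is touched, which is where the
    # large lookup speedup comes from.
    return W[w] @ h + b[w]

def training_loss(h, w, alpha=0.1):
    # Cross-entropy plus a penalty that drives log Z toward 0, so that
    # unnormalized scores approximately sum to 1 after training.
    s = W @ h + b
    log_Z = np.log(np.sum(np.exp(s)))
    return -(s[w] - log_Z) + alpha * log_Z ** 2

h = rng.normal(size=H)               # a hypothetical hidden-layer activation
print(normalized_logprob(h, 42), unnormalized_logprob(h, 42), training_loss(h, 42))
```

With random weights the two log-probabilities differ by the log normalizer; the point of the penalized objective is that, after training, this gap stays small enough that the cheap unnormalized lookup can replace the softmax in decoding.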
Pages: 5416 - 5420
Page count: 5