Efficient training of large neural networks for language modeling

Cited by: 0
Authors
Schwenk, H [1 ]
Affiliation
[1] CNRS, LIMSI, F-91403 Orsay, France
Keywords
DOI
Not available
CLC classification
TP18 (Artificial intelligence theory)
Discipline codes
081104; 0812; 0835; 1405
Abstract
Recently there has been increasing interest in using neural networks for language modeling. In contrast to the well-known backoff n-gram language models, the neural network approach tries to limit the data sparseness problem by performing the estimation in a continuous space, which allows smooth interpolation. However, the complexity of training such a model and of calculating a single n-gram probability is several orders of magnitude higher than for the backoff models, making the new approach difficult to use in real applications. In this paper several techniques are presented that allow the use of a neural network language model in a large-vocabulary speech recognition system, in particular very fast lattice rescoring and efficient training of large neural networks on corpora of more than 10 million words. The described approach achieves significant word error reductions with respect to a carefully tuned 4-gram backoff language model in a state-of-the-art conversational speech recognizer for the DARPA Rich Transcription evaluations.
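To make the "estimation in a continuous space" idea concrete, the following is a minimal sketch of the forward pass of a feedforward n-gram neural language model in the general style the abstract describes: each context word is mapped to a continuous vector through a shared projection matrix, the concatenated vectors feed a hidden layer, and a softmax output gives the probability of every vocabulary word. All sizes, names, and initializations here are illustrative assumptions, not the paper's actual configuration (which uses far larger vocabularies and layers, plus the speed-up techniques the paper contributes).

```python
import numpy as np

# Hypothetical toy sizes; the model in the paper is orders of magnitude larger.
V = 50   # vocabulary size
n = 4    # n-gram order: n-1 context words predict the next word
d = 8    # dimension of the continuous word representation (projection layer)
h = 16   # hidden layer size

rng = np.random.default_rng(0)
C = rng.normal(scale=0.1, size=(V, d))               # shared projection matrix
W_h = rng.normal(scale=0.1, size=((n - 1) * d, h))   # projection -> hidden
b_h = np.zeros(h)
W_o = rng.normal(scale=0.1, size=(h, V))             # hidden -> output
b_o = np.zeros(V)

def ngram_probs(context_ids):
    """Return P(w | context) for every word w in the vocabulary."""
    # Map each discrete context word to its continuous vector and concatenate.
    x = np.concatenate([C[i] for i in context_ids])
    hidden = np.tanh(x @ W_h + b_h)
    logits = hidden @ W_o + b_o
    e = np.exp(logits - logits.max())                # numerically stable softmax
    return e / e.sum()

p = ngram_probs([3, 17, 42])   # a distribution over all V words
```

Because the probability of every word is produced in one forward pass, the dominant cost is the hidden-to-output multiplication over the full vocabulary, which is exactly why the paper's fast rescoring and training techniques matter.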
Pages: 3059-3064
Page count: 6