Efficient training of large neural networks for language modeling

Cited by: 0
Authors
Schwenk, H [1 ]
Affiliation
[1] CNRS, LIMSI, F-91403 Orsay, France
Keywords
DOI
Not available
CLC classification
TP18 (Artificial intelligence theory)
Discipline codes
081104; 0812; 0835; 1405
Abstract
Recently there has been increasing interest in using neural networks for language modeling. In contrast to the well-known backoff n-gram language models, the neural network approach tries to limit the data sparseness problem by performing the estimation in a continuous space, which allows smooth interpolation. However, the complexity of training such a model and of calculating a single n-gram probability is several orders of magnitude higher than for the backoff models, making the new approach difficult to use in real applications. In this paper several techniques are presented that allow the use of a neural network language model in a large-vocabulary speech recognition system, in particular very fast lattice rescoring and efficient training of large neural networks on corpora of more than 10 million words. The described approach achieves significant word error reductions with respect to a carefully tuned 4-gram backoff language model in a state-of-the-art conversational speech recognizer for the DARPA Rich Transcription evaluations.
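To make the "estimation in a continuous space" idea concrete, the following is a minimal sketch of the forward pass of a feedforward n-gram neural language model in the general style the abstract describes: each context word is mapped to a continuous vector through a shared projection matrix, the concatenated vectors feed a hidden layer, and a softmax output gives the probability of every vocabulary word. All sizes, names, and initializations here are illustrative assumptions, not the paper's actual configuration (which uses far larger vocabularies and layers, plus the speed-up techniques the paper contributes).

```python
import numpy as np

# Hypothetical toy sizes; the model in the paper is orders of magnitude larger.
V = 50   # vocabulary size
n = 4    # n-gram order: n-1 context words predict the next word
d = 8    # dimension of the continuous word representation (projection layer)
h = 16   # hidden layer size

rng = np.random.default_rng(0)
C = rng.normal(scale=0.1, size=(V, d))               # shared projection matrix
W_h = rng.normal(scale=0.1, size=((n - 1) * d, h))   # projection -> hidden
b_h = np.zeros(h)
W_o = rng.normal(scale=0.1, size=(h, V))             # hidden -> output
b_o = np.zeros(V)

def ngram_probs(context_ids):
    """Return P(w | context) for every word w in the vocabulary."""
    # Map each discrete context word to its continuous vector and concatenate.
    x = np.concatenate([C[i] for i in context_ids])
    hidden = np.tanh(x @ W_h + b_h)
    logits = hidden @ W_o + b_o
    e = np.exp(logits - logits.max())                # numerically stable softmax
    return e / e.sum()

p = ngram_probs([3, 17, 42])   # a distribution over all V words
```

Because the probability of every word is produced in one forward pass, the dominant cost is the hidden-to-output multiplication over the full vocabulary, which is exactly why the paper's fast rescoring and training techniques matter.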
Pages: 3059-3064
Page count: 6