Converting Continuous-Space Language Models into N-gram Language Models with Efficient Bilingual Pruning for Statistical Machine Translation

Cited by: 4
Authors
Wang, Rui [1 ]
Utiyama, Masao [2 ]
Goto, Isao [3 ,4 ]
Sumita, Eiichiro [2 ]
Zhao, Hai [1 ,5 ]
Lu, Bao-Liang [1 ,5 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Ctr Brain Like Comp & Machine Intelligence, Dept Comp Sci & Engn, 800 Dongchuan Rd, Shanghai 200240, Peoples R China
[2] Natl Inst Informat & Commun Technol, Multilingual Translat Lab, 3-5 Hikaridai, Kyoto 6190289, Japan
[3] NHK Japan Broadcasting Corp, Sci & Technol Res Labs, Setagaya Ku, 1-10-11 Kinuta, Tokyo 1578510, Japan
[4] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
[5] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interac, 800 Dongchuan Rd, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine translation; continuous-space language model; neural network language model; language model pruning;
DOI
10.1145/2843942
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Language Model (LM) is an essential component of Statistical Machine Translation (SMT). In this article, we focus on developing efficient methods for LM construction. Our main contribution is a Natural N-grams based Converting (NNGC) method for transforming a Continuous-Space Language Model (CSLM) into a Back-off N-gram Language Model (BNLM). In addition, a Bilingual LM Pruning (BLMP) approach is developed to enhance LMs in SMT decoding and to speed up CSLM conversion. Working jointly, the proposed pruning and converting methods can convert a large LM efficiently: an LM can be effectively pruned before it is converted from the CSLM without sacrificing performance, and can be further improved even if an additional corpus contains out-of-domain information. Across different SMT tasks, our experimental results indicate that the proposed NNGC and BLMP methods significantly outperform their existing counterparts in both BLEU and computational cost.
Pages: 26
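
The abstract describes two components: converting a CSLM into a back-off n-gram LM by rescoring n-grams with the neural model, and pruning the LM with bilingual information. Below is a minimal, hypothetical Python sketch of the converting step only; the function names (convert_to_backoff_lm, cslm_prob) and the mass-preserving renormalization are assumptions made for illustration, not the paper's exact NNGC algorithm.

```python
from collections import defaultdict

def convert_to_backoff_lm(ngrams, cslm_prob):
    """Rescore observed n-grams with a continuous-space LM, then renormalize
    so each history keeps the probability mass its explicit n-grams had in
    the original model (leaving the back-off mass untouched).

    ngrams    : dict mapping (history_tuple, word) -> original probability
    cslm_prob : callable (word, history_tuple) -> CSLM probability (assumed interface)
    """
    orig_mass = defaultdict(float)   # explicit mass per history in the original LM
    cslm_mass = defaultdict(float)   # unnormalized CSLM mass per history
    cslm_scores = {}

    for (history, word), p in ngrams.items():
        orig_mass[history] += p
        s = cslm_prob(word, history)
        cslm_scores[(history, word)] = s
        cslm_mass[history] += s

    # Scale the CSLM scores so the total explicit mass per history is preserved;
    # the original back-off weights then remain consistent.
    converted = {}
    for (history, word), s in cslm_scores.items():
        denom = cslm_mass[history]
        scale = orig_mass[history] / denom if denom > 0 else 0.0
        converted[(history, word)] = s * scale
    return converted


if __name__ == "__main__":
    # Toy example: two bigrams sharing the history ("the",).
    ngrams = {(("the",), "cat"): 0.4, (("the",), "dog"): 0.3}
    # Stand-in for a neural LM; any callable with this signature works.
    toy_cslm = lambda w, h: {"cat": 0.6, "dog": 0.2}[w]
    print(convert_to_backoff_lm(ngrams, toy_cslm))
```

In this toy run the two rescored bigrams are scaled to sum back to 0.7, the mass they held in the original LM, which keeps the back-off distribution for the history valid.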