Converting Continuous-Space Language Models into N-gram Language Models with Efficient Bilingual Pruning for Statistical Machine Translation

Cited by: 4
Authors
Wang, Rui [1 ]
Utiyama, Masao [2 ]
Goto, Isao [3 ,4 ]
Sumita, Eiichiro [2 ]
Zhao, Hai [1 ,5 ]
Lu, Bao-Liang [1 ,5 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Ctr Brain Like Comp & Machine Intelligence, Dept Comp Sci & Engn, 800 Dongchuan Rd, Shanghai 200240, Peoples R China
[2] Natl Inst Informat & Commun Technol, Multilingual Translat Lab, 3-5 Hikaridai, Kyoto 6190289, Japan
[3] NHK Japan Broadcasting Corp, Sci & Technol Res Labs, Setagaya Ku, 1-10-11 Kinuta, Tokyo 1578510, Japan
[4] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
[5] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interac, 800 Dongchuan Rd, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine translation; continuous-space language model; neural network language model; language model pruning
DOI
10.1145/2843942
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The Language Model (LM) is an essential component of Statistical Machine Translation (SMT). In this article, we focus on developing efficient methods for LM construction. Our main contribution is a Natural N-grams based Converting (NNGC) method for transforming a Continuous-Space Language Model (CSLM) into a Back-off N-gram Language Model (BNLM). Furthermore, a Bilingual LM Pruning (BLMP) approach is developed for enhancing LMs in SMT decoding and speeding up CSLM conversion. Working jointly, the proposed pruning and converting methods can convert a large LM efficiently: an LM can be effectively pruned before it is converted from the CSLM without sacrificing performance, and further improved if an additional corpus contains out-of-domain information. Across different SMT tasks, our experimental results indicate that the proposed NNGC and BLMP methods significantly outperform existing counterpart approaches in both BLEU and computational cost.
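The core idea the abstract describes can be sketched as follows: enumerate only the "natural" n-grams (those actually observed in a corpus), query a continuous-space model for each one's probability, and prune low-scoring entries before storing them in a back-off table. This is an illustrative sketch only, not the authors' implementation: `cslm_logprob` is a hypothetical stand-in for a trained CSLM query, and the threshold-based pruning is a crude placeholder for the paper's bilingual pruning criterion.

```python
from collections import Counter
from math import log10

def cslm_logprob(history, word, unigrams, total):
    # Hypothetical stand-in for a neural LM query: an add-one-smoothed
    # unigram estimate. A real CSLM would condition on `history`.
    return log10((unigrams[word] + 1) / (total + len(unigrams)))

def convert_to_ngram_lm(sentences, order=2, prune_threshold=-3.0):
    """Collect natural n-grams (those observed in `sentences`), score
    each with the CSLM stand-in, and drop entries whose log10
    probability falls below `prune_threshold` (a simplistic proxy for
    the paper's bilingual pruning step)."""
    unigrams = Counter(w for s in sentences for w in s)
    total = sum(unigrams.values())
    lm = {}
    for s in sentences:
        for i in range(len(s)):
            for n in range(1, order + 1):
                if i + n > len(s):
                    break
                ngram = tuple(s[i:i + n])
                if ngram in lm:
                    continue  # score each natural n-gram once
                lp = cslm_logprob(ngram[:-1], ngram[-1], unigrams, total)
                if lp >= prune_threshold:  # prune unlikely entries
                    lm[ngram] = lp
    return lm

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
lm = convert_to_ngram_lm(sentences)
```

Because only observed n-grams are scored, the number of CSLM queries grows with the corpus rather than with the (exponential) space of all possible n-grams, which is what makes the conversion tractable.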
Pages: 26