Pivot language approach for phrase-based statistical machine translation

Cited by: 49
Authors
Wu, Hua [1]
Wang, Haifeng [1]
Affiliations
[1] Toshiba China Res & Dev Ctr, 501, Tower W2, Oriental Plaza, 1 East Chang An Ave, Beijing 100738, Peoples R China
Keywords
Pivot language; Phrase-based statistical machine translation; Scarce bilingual resources
DOI
10.1007/s10590-008-9041-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a novel method for phrase-based statistical machine translation based on the use of a pivot language. To translate between languages L_s and L_t with limited bilingual resources, we bring in a third language, L_p, called the pivot language. For the language pairs L_s-L_p and L_p-L_t, there exist large bilingual corpora. Using only the L_s-L_p and L_p-L_t bilingual corpora, we can build a translation model for L_s-L_t. The advantage of this method is that we can perform translation between L_s and L_t even if there is no bilingual corpus available for this language pair. Using BLEU as a metric, our pivot language approach significantly outperforms the standard model trained on a small bilingual corpus. Moreover, with a small L_s-L_t bilingual corpus available, our method can further improve translation quality by using the additional L_s-L_p and L_p-L_t bilingual corpora.
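The abstract does not give the formulas, so the following is only a minimal sketch of one common way to realize a pivot approach: phrase-table triangulation, which sums over shared pivot phrases to estimate p(t|s) from p(p|s) and p(t|p), followed by an optional linear interpolation with a small direct table. The function names `triangulate` and `interpolate`, the nested-dictionary table layout, the interpolation weight, and the toy probabilities are all illustrative assumptions, not the authors' implementation or data.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Estimate p(tgt | src) by summing over shared pivot phrases.

    Assumed (hypothetical) table layout: {phrase: {translation: probability}}.
    Pivot phrases missing from pivot_to_tgt contribute nothing.
    """
    src_to_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            for tgt, p_tgt_given_pivot in pivot_to_tgt.get(pivot, {}).items():
                src_to_tgt[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in src_to_tgt.items()}


def interpolate(direct, pivot, weight):
    """Linearly mix a small direct table with a pivot-induced table.

    `weight` is the share given to the direct table (an assumed tuning knob).
    """
    merged = {}
    for src in set(direct) | set(pivot):
        d = direct.get(src, {})
        p = pivot.get(src, {})
        merged[src] = {
            tgt: weight * d.get(tgt, 0.0) + (1.0 - weight) * p.get(tgt, 0.0)
            for tgt in set(d) | set(p)
        }
    return merged


# Toy tables (made-up numbers): Spanish source, English pivot, French target.
SRC_TO_PIVOT = {"casa": {"house": 0.8, "home": 0.2}}
PIVOT_TO_TGT = {
    "house": {"maison": 0.9, "domicile": 0.1},
    "home": {"maison": 0.5, "foyer": 0.5},
}
DIRECT = {"casa": {"maison": 1.0}}

if __name__ == "__main__":
    induced = triangulate(SRC_TO_PIVOT, PIVOT_TO_TGT)
    print(induced)
    print(interpolate(DIRECT, induced, weight=0.5))
```

In this toy case the induced table needs no L_s-L_t data at all, and the interpolation step shows how a small direct table could be combined with it when one is available.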
Pages: 165 - 181
Page count: 17
Related papers
50 in total
  • [41] Phrase-Based & Neural Unsupervised Machine Translation
    Lample, Guillaume
    Ott, Myle
    Conneau, Alexis
    Denoyer, Ludovic
    Ranzato, Marc'Aurelio
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 5039 - 5049
  • [42] A reordering model for phrase-based machine translation
    Nguyen, Vinh Van
    Nguyen, Thai Phuong
    Shimazu, Akira
    Nguyen, Minh Le
    ADVANCES IN NATURAL LANGUAGE PROCESSING, PROCEEDINGS, 2008, 5221 : 476 - +
  • [43] A vector-space dynamic feature for phrase-based statistical machine translation
    Costa-jussa, Marta R.
    Banchs, Rafael E.
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2011, 37 (02) : 139 - 154
  • [44] A vector-space dynamic feature for phrase-based statistical machine translation
    Marta R. Costa-jussà
    Rafael E. Banchs
    Journal of Intelligent Information Systems, 2011, 37 : 139 - 154
  • [45] Online adaptation to post-edits for phrase-based statistical machine translation
    Bertoldi, Nicola
    Simianer, Patrick
    Cettolo, Mauro
    Waeschle, Katharina
    Federico, Marcello
    Riezler, Stefan
    MACHINE TRANSLATION, 2014, 28 (3-4) : 309 - 339
  • [46] A general framework to deal with the scaling problem in phrase-based statistical machine translation
    Ortiz, Daniel
    Varea, Ismael Garcia
    Casacuberta, Francisco
    PATTERN RECOGNITION AND IMAGE ANALYSIS, PT 2, PROCEEDINGS, 2007, 4478 : 314 - +
  • [47] Pharaoh: A beam search decoder for phrase-based statistical machine translation models
    Koehn, P
    MACHINE TRANSLATION: FROM REAL USERS TO RESEARCH, PROCEEDINGS, 2004, 3265 : 115 - 124
  • [48] Learning local word reorderings for hierarchical phrase-based statistical machine translation
    Zhang, Jingyi
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Hai
    Neubig, Graham
    Nakamura, Satoshi
    MACHINE TRANSLATION, 2016, 30 (1-2) : 1 - 18
  • [49] Phrase-Based Machine Translation based on Simulated Annealing
    Lavecchia, Caroline
    Langlois, David
    Smaili, Kamel
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 3123 - 3129
  • [50] Using collocation segmentation to extract translation units in a phrase-based statistical machine translation system
    Costa-jussa, Marta R.
    Daudaravicius, Vidas
    Banchs, Rafael E.
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2010, (45) : 215 - 220