Pivot language approach for phrase-based statistical machine translation

Cited by: 49
Authors
Wu, Hua [1 ]
Wang, Haifeng [1 ]
Affiliation
[1] Toshiba China Res & Dev Ctr, 501,Tower W2,Oriental Plaza,1,East Chang An Ave, Beijing 100738, Peoples R China
Keywords
Pivot language; Phrase-based statistical machine translation; Scarce bilingual resources;
DOI
10.1007/s10590-008-9041-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a novel method for phrase-based statistical machine translation based on the use of a pivot language. To translate between languages L_s and L_t with limited bilingual resources, we bring in a third language, L_p, called the pivot language. For the language pairs L_s-L_p and L_p-L_t, there exist large bilingual corpora. Using only L_s-L_p and L_p-L_t bilingual corpora, we can build a translation model for L_s-L_t. The advantage of this method lies in the fact that we can perform translation between L_s and L_t even if there is no bilingual corpus available for this language pair. Using BLEU as a metric, our pivot language approach significantly outperforms the standard model trained on a small bilingual corpus. Moreover, with a small L_s-L_t bilingual corpus available, our method can further improve translation quality by using the additional L_s-L_p and L_p-L_t bilingual corpora.
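The abstract does not spell out how the two corpora are combined into a source-target model. One common way to realize a pivot approach, sketched here as an assumption rather than as the authors' stated procedure, is to combine a source-pivot phrase table with a pivot-target phrase table by summing over the pivot phrases they share. The function name `triangulate`, the nested-dictionary table layout, and the toy probabilities below are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Combine two phrase tables through shared pivot phrases.

    src_to_pivot: {source_phrase: {pivot_phrase: p(pivot | source)}}
    pivot_to_tgt: {pivot_phrase: {target_phrase: p(target | pivot)}}
    Returns {source_phrase: {target_phrase: p(target | source)}},
    where p(target | source) = sum over pivots of
    p(pivot | source) * p(target | pivot).
    """
    src_to_tgt = {}
    for src, pivots in src_to_pivot.items():
        scores = defaultdict(float)
        for pivot, p_pivot in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt in pivot_to_tgt.get(pivot, {}).items():
                scores[tgt] += p_pivot * p_tgt
        if scores:
            src_to_tgt[src] = dict(scores)
    return src_to_tgt


# Toy data (invented for illustration, not taken from the paper).
src_to_pivot = {"chat": {"cat": 0.75, "kitty": 0.25}}
pivot_to_tgt = {
    "cat": {"gato": 1.0},
    "kitty": {"gato": 0.5, "gatito": 0.5},
}

table = triangulate(src_to_pivot, pivot_to_tgt)
print(table)
```

In this sketch no source-target bilingual data is used at all, which mirrors the abstract's claim that translation is possible even when no L_s-L_t corpus exists. A real system would also need to prune the combined table and handle further feature scores, which are left out here.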
Pages: 165-181
Page count: 17
Related papers
50 in total
  • [31] Using TectoMT as a Preprocessing Tool for Phrase-Based Statistical Machine Translation
    Zeman, Daniel
    TEXT, SPEECH AND DIALOGUE, 2010, 6231 : 216 - 223
  • [32] Translation paraphrases in phrase-based machine translation
    Guzman, Francisco
    Garrido, Leonardo
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2008, 4919 : 388 - 398
  • [33] Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation
    Tinsley, John
    Hearne, Mary
    Way, Andy
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2009, 5449 : 318 - 331
  • [34] Linguistic Resources for Factored Phrase-Based Statistical Machine Translation Systems
    Navlea, Mirabela
    Todirascu, Amalia
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : H41 - H48
  • [35] Learning Word Reorderings for Hierarchical Phrase-based Statistical Machine Translation
    Zhang, Jingyi
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Hai
    PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL) AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (IJCNLP), VOL 2, 2015, : 542 - 548
  • [36] Malayalam Natural Language Processing: Challenges in Building a Phrase-Based Statistical Machine Translation System
    Sebastian, Mary Priya
    Kumar, G. Santhosh
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (04)
  • [37] Parts of Speech Tagged Phrase-Based Statistical Machine Translation System for English → Mizo Language
    Devi C.S.
    Roy A.K.
    Purkayastha B.S.
    SN Computer Science, 4 (6)
  • [38] A Semi-supervised Approach to Bengali-English Phrase-Based Statistical Machine Translation
    Roy, Maxim
    ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2009, 5549 : 291 - +
  • [39] A unified framework and models for integrating translation memory into phrase-based statistical machine translation
    Liu, Yang
    Wang, Kun
    Zong, Chengqing
    Su, Keh-Yih
    COMPUTER SPEECH AND LANGUAGE, 2019, 54 : 176 - 206
  • [40] A Phrase-Based Approach based on morphological information for Japanese-Uighur Statistical Machine Translation System
    Nimaiti, Maimitili
    Izumi, Yamamoto
    PROCEEDINGS OF THE 2013 12TH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI CC 2013), 2013, : 133 - 136