Linguistic-Relationships-Based Approach for Improving Word Alignment

Cited by: 6
Authors
Phuoc Tran [1]
Dien Dinh [2]
Tan Le [3]
Long H. B. Nguyen [2]
Affiliations
[1] Ton Duc Thang Univ, Fac Informat Technol, NLP KD Lab, Ho Chi Minh City, Vietnam
[2] VNU Univ Sci, Fac Informat Technol, Ho Chi Minh City, Vietnam
[3] Univ Quebec, Fac Informat Technol, Montreal, PQ, Canada
Keywords
Word alignment; linguistic relationships; Chinese-Vietnamese machine translation; Sino-Vietnamese; content word
DOI
10.1145/3133323
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unsupervised word alignment models (such as GIZA++) are widely used in phrase-based statistical machine translation. The quality of such a model improves with the size and quality of the bilingual corpus. For low-resource language pairs such as Chinese and Vietnamese, however, unsupervised word alignment is sometimes of low quality because the data are sparse. In addition, these models do not exploit linguistic relationships to improve word alignment performance. Chinese and Vietnamese are of the same language type and have close linguistic relationships. In this article, we integrate the characteristics of these linguistic relationships into the word alignment model to enhance the quality of Chinese-Vietnamese word alignment. The relationships used are Sino-Vietnamese and content words. The experimental results show that our method improves the performance of word alignment as well as the quality of machine translation.
Pages: 16