Linguistic-Relationships-Based Approach for Improving Word Alignment

Cited by: 6
Authors
Phuoc Tran [1]
Dien Dinh [2]
Tan Le [3]
Long H. B. Nguyen [2]
Affiliations
[1] Ton Duc Thang Univ, Fac Informat Technol, NLP KD Lab, Ho Chi Minh City, Vietnam
[2] VNU Univ Sci, Fac Informat Technol, Ho Chi Minh City, Vietnam
[3] Univ Quebec, Fac Informat Technol, Montreal, PQ, Canada
Keywords
Word alignment; linguistic relationships; Chinese-Vietnamese machine translation; Sino-Vietnamese; content word
DOI
10.1145/3133323
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unsupervised word alignment models (such as GIZA++) are widely used in phrase-based statistical machine translation. The quality of such a model depends on the size and quality of the bilingual corpus. However, for low-resource language pairs such as Chinese and Vietnamese, the output of unsupervised word alignment is sometimes of low quality because the data are sparse. In addition, these models do not take advantage of linguistic relationships to improve word alignment performance. Chinese and Vietnamese belong to the same language type and have close linguistic relationships. In this article, we integrate the characteristics of these linguistic relationships into the word alignment model to enhance the quality of Chinese-Vietnamese word alignment. The relationships used are Sino-Vietnamese words and content words. The experimental results showed that our method improved the performance of word alignment as well as the quality of machine translation.
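The abstract does not say how the linguistic relationships enter the alignment model, so the following is only a minimal sketch of one common way to do it, not the authors' published method. It trains a simplified IBM Model 1 (no NULL word) with EM and adds a pseudo-count to word pairs listed in a prior lexicon, for example known Sino-Vietnamese pairs. The function names, the `prior_pairs` lexicon, and the `pseudo_count` parameter are all hypothetical, and the two-word sentence pair is toy data.

```python
from collections import defaultdict


def train_ibm1_with_prior(bitext, prior_pairs=frozenset(), pseudo_count=1.0, iterations=10):
    """Simplified IBM Model 1 (no NULL word) with lexical pseudo-counts.

    bitext: list of (source_tokens, target_tokens) pairs.
    prior_pairs: hypothetical set of (source, target) word pairs believed to be
        translations, e.g. taken from a Sino-Vietnamese lexicon (an assumption).
    Returns a dict t[(source, target)] approximating P(target | source).
    """
    # Only word pairs that co-occur in some sentence pair are candidates.
    cooc = {(e, f) for src, tgt in bitext for e in src for f in tgt}
    targets_of = defaultdict(set)
    for e, f in cooc:
        targets_of[e].add(f)

    # Uniform initialisation of t(f | e).
    t = {(e, f): 1.0 / len(targets_of[e]) for e, f in cooc}

    for _ in range(iterations):
        # E-step: collect expected alignment counts.
        count = defaultdict(float)
        for src, tgt in bitext:
            for f in tgt:
                z = sum(t[(e, f)] for e in src)
                for e in src:
                    count[(e, f)] += t[(e, f)] / z

        # Inject the prior: extra pseudo-counts for lexicon pairs.
        for pair in cooc:
            if pair in prior_pairs:
                count[pair] += pseudo_count

        # M-step: renormalise per source word.
        total = defaultdict(float)
        for (e, f), c in count.items():
            total[e] += c
        t = {(e, f): count[(e, f)] / total[e] for e, f in cooc}
    return t


def align(src, tgt, t):
    """Link each target position to its most probable source position."""
    return [
        (max(range(len(src)), key=lambda i: t.get((src[i], f), 0.0)), j)
        for j, f in enumerate(tgt)
    ]


# Toy data: one sentence pair, which plain Model 1 cannot disambiguate.
toy_bitext = [(["大学", "学生"], ["đại_học", "học_sinh"])]
toy_prior = {("大学", "đại_học"), ("学生", "học_sinh")}

t_prior = train_ibm1_with_prior(toy_bitext, toy_prior)
print(align(toy_bitext[0][0], toy_bitext[0][1], t_prior))
```

With a single sentence pair, the unmodified model leaves every candidate pair equally likely, while the pseudo-counts push probability mass toward the lexicon pairs. This mirrors the sparse-data situation the abstract describes, where corpus statistics alone are too weak and outside knowledge has to break the tie.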
Pages: 16