Enhancing distant low-resource neural machine translation with semantic pivot

Cited by: 0
Authors
Zhu, Enchang [1 ,2 ]
Huang, Yuxin [2 ]
Xian, Yantuan [2 ]
Zhu, Junguo [1 ]
Gao, Minghu [1 ]
Yu, Zhiqiang [1 ]
Affiliations
[1] Yunnan Minzu Univ, Sch Math & Comp Sci, Kunming 650500, Peoples R China
[2] Kunming Univ Sci & Technol, Sch Informat Engn & Automat, Kunming 650500, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Machine translation; Chinese-Lao; Pivot; Adapter; Similar linguistic feature;
DOI
10.1016/j.aej.2024.12.073
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08;
Abstract
Prior work has shown that pivot-based methods can boost the performance of neural machine translation (NMT). However, in low-resource scenarios, the effectiveness of pivot-based methods is severely impaired by the data sparsity problem. As a typical low-resource language pair, Chinese-Lao NMT suffers from the same performance dilemma. In addition, due to the significant linguistic gap between Chinese and Lao, some traditional and effective low-resource translation methods, such as introducing external similarity knowledge, sharing a word space, and literal translation, are not suitable for this language pair. Fortunately, the pair is well suited to a pivot strategy, as there is a pivot language, Thai, which is highly similar to the target language, Lao. Here, we propose a novel approach for incorporating similar linguistic features between Thai and Lao into the Chinese-Lao translation model. First, an in-depth linguistic similarity analysis of Thai and Lao is conducted. Second, an elaborate pivot-based translation framework with a KL adapter is applied. Experiments on the Chinese-Lao translation task show that our approach helps transfer more linguistic knowledge from the Chinese encoder to the Lao decoder via similar linguistic features, achieving substantial improvements over the baseline models.
Pages: 633-643
Page count: 11