Enhancing distant low-resource neural machine translation with semantic pivot

Times cited: 0
Authors
Zhu, Enchang [1, 2]
Huang, Yuxin [2 ]
Xian, Yantuan [2 ]
Zhu, Junguo [1 ]
Gao, Minghu [1 ]
Yu, Zhiqiang [1 ]
Affiliations
[1] Yunnan Minzu Univ, Sch Math & Comp Sci, Kunming 650500, Peoples R China
[2] Kunming Univ Sci & Technol, Sch Informat Engn & Automat, Kunming 650500, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine translation; Chinese-Lao; Pivot; Adapter; Similar linguistic feature
DOI
10.1016/j.aej.2024.12.073
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
Prior work has shown that pivot-based methods can boost the performance of neural machine translation (NMT). However, in low-resource scenarios, the efficiency of pivot-based methods is severely impaired by data sparsity. As a typical low-resource language pair, Chinese-Lao suffers from the same performance problem. In addition, because of the significant linguistic gap between Chinese and Lao, some traditional and effective low-resource translation methods, such as introducing similar external knowledge, sharing a word space, and literal translation, are not suitable for this language pair. Fortunately, the pair is well suited to a pivot strategy, since there is a pivot language, Thai, that is highly similar to the target language, Lao. Here, we propose a novel approach for incorporating the similar linguistic features of Thai and Lao into the Chinese-Lao translation model. First, an in-depth analysis of the linguistic similarity between Thai and Lao is conducted. Second, an elaborate pivot-based translation framework with a KL adapter is applied. Experiments on the Chinese-Lao translation task show that our approach helps transfer more linguistic knowledge from the Chinese encoder to the Lao decoder via similar linguistic features, achieving substantial improvements over the baseline models.
Pages: 633-643
Number of pages: 11