Pre-training neural machine translation with alignment information via optimal transport

Cited by: 0
Authors
Su, Xueping [1 ]
Zhao, Xingkai [1 ]
Ren, Jie [1 ]
Li, Yunhong [1 ]
Raetsch, Matthias [2 ]
Affiliations
[1] Xian Polytech Univ, Sch Elect & Informat, Xian, Peoples R China
[2] Reutlingen Univ, Dept Engn, Interact & Mobile Robot & Artificial Intelligence, Reutlingen, Germany
Funding
National Natural Science Foundation of China;
Keywords
Optimal Transport; Alignment Information; Pre-training; Neural Machine Translation;
DOI
10.1007/s11042-023-17479-z
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
With the rapid development of globalization, the demand for translation between different languages is also increasing. Although pre-training has achieved excellent results in neural machine translation, existing neural machine translation models have almost no high-quality alignment information suited to specific domains, so this paper proposes pre-training neural machine translation with alignment information via optimal transport. First, this paper narrows the representation gap between different languages by using OTAP to generate domain-specific data for information alignment, learning richer semantic information. Second, this paper proposes a lightweight model, DR-Reformer, which uses Reformer as the backbone network and adds Dropout layers and Reduction layers, reducing model parameters without losing accuracy and improving computational efficiency. Experiments on the Chinese-English datasets of AI Challenger 2018 and WMT-17 show that the proposed algorithm outperforms existing algorithms.
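The abstract describes OTAP as using optimal transport to align source- and target-language representations. The sketch below illustrates one common form of such an alignment step, assuming entropic (Sinkhorn) optimal transport between the token embeddings of a sentence pair; the function name, cost choice, and hyperparameters are illustrative assumptions, not the authors' implementation.

import torch

def sinkhorn_alignment(src_emb, tgt_emb, n_iters=50, eps=0.1):
    # src_emb: (m, d) source token embeddings; tgt_emb: (n, d) target token embeddings.
    # Returns an (m, n) soft transport plan that can serve as an alignment matrix.
    # Cost: cosine distance between every source/target embedding pair.
    src = torch.nn.functional.normalize(src_emb, dim=-1)
    tgt = torch.nn.functional.normalize(tgt_emb, dim=-1)
    cost = 1.0 - src @ tgt.T                      # (m, n)

    # Uniform marginals over source and target tokens.
    m, n = cost.shape
    a = torch.full((m,), 1.0 / m)
    b = torch.full((n,), 1.0 / n)

    # Sinkhorn iterations in scaling form: P = diag(u) K diag(v).
    K = torch.exp(-cost / eps)
    u = torch.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u.unsqueeze(1) * K * v.unsqueeze(0)    # transport plan, total mass ~1

# Usage: align two toy "sentences" of random embeddings.
if __name__ == "__main__":
    torch.manual_seed(0)
    plan = sinkhorn_alignment(torch.randn(5, 16), torch.randn(7, 16))
    print(plan.shape, plan.sum().item())          # torch.Size([5, 7]), ~1.0

A plan like this gives a soft word-alignment signal between the two languages; how OTAP turns it into domain-specific pre-training data is detailed in the paper itself.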
Pages: 48377 - 48397
Page count: 21
Related Papers
50 records in total
  • [31] Pre-training of Recurrent Neural Networks via Linear Autoencoders
    Pasa, Luca
    Sperduti, Alessandro
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [32] Ponder: Point Cloud Pre-training via Neural Rendering
    Huang, Di
    Peng, Sida
    He, Tong
    Yang, Honghui
    Zhou, Xiaowei
    Ouyang, Wanli
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16043 - 16052
  • [33] Neural Machine Translation Based on XLM-R Cross-lingual Pre-training Language Model
    Wang Q.
    Li M.
    Wu S.
    Wang M.
    Beijing Daxue Xuebao (Ziran Kexue Ban)/Acta Scientiarum Naturalium Universitatis Pekinensis, 2022, 58 (01): : 29 - 36
  • [34] Breaking Corpus Bottleneck for Context-Aware Neural Machine Translation with Cross-Task Pre-training
    Chen, Linqing
    Li, Junhui
    Gong, Zhengxian
    Chen, Boxing
    Luo, Weihua
    Zhang, Min
    Zhou, Guodong
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 2851 - 2861
  • [35] Pre-training via Paraphrasing
    Lewis, Mike
    Ghazvininejad, Marjan
    Ghosh, Gargi
    Aghajanyan, Armen
    Wang, Sida
    Zettlemoyer, Luke
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [36] Multilingual Translation from Denoising Pre-Training
    Tang, Yuqing
    Tran, Chau
    Li, Xian
    Chen, Peng-Jen
    Goyal, Naman
    Chaudhary, Vishrav
    Gu, Jiatao
    Fan, Angela
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3450 - 3466
  • [37] Does Masked Language Model Pre-training with Artificial Data Improve Low-resource Neural Machine Translation?
    Tamura, Hiroto
    Hirasawa, Tosho
    Kim, Hwichan
    Komachi, Mamoru
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 2216 - 2225
  • [38] Pre-training Methods in Information Retrieval
    Fan, Yixing
    Xie, Xiaohui
    Cai, Yinqiong
    Chen, Jia
    Ma, Xinyu
    Li, Xiangsheng
    Zhang, Ruqing
    Guo, Jiafeng
    FOUNDATIONS AND TRENDS IN INFORMATION RETRIEVAL, 2022, 16 (03): : 178 - 317
  • [39] Language Model Pre-training Method in Machine Translation Based on Named Entity Recognition
    Li, Zhen
    Qu, Dan
    Xie, Chaojie
    Zhang, Wenlin
    Li, Yanxia
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2020, 29 (7-8)
  • [40] Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation
    Guerreiro, Nuno M.
    Colombo, Pierre
    Piantanida, Pablo
    Martins, Andre F. T.
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 13766 - 13784