Aligned Cross Entropy for Non-Autoregressive Machine Translation

Cited: 0
Authors
Ghazvininejad, Marjan [1 ]
Karpukhin, Vladimir [1 ]
Zettlemoyer, Luke [1 ]
Levy, Omer [1 ]
Affiliation
[1] Facebook AI Research, Menlo Park, CA 94025, USA
Keywords
(none listed)
DOI
None available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Non-autoregressive machine translation models significantly speed up decoding by allowing for parallel prediction of the entire target sequence. However, modeling word order is more challenging due to the lack of autoregressive factors in the model. This difficulty is compounded during training with cross entropy loss, which can highly penalize small shifts in word order. In this paper, we propose aligned cross entropy (AXE) as an alternative loss function for training non-autoregressive models. AXE uses a differentiable dynamic program to assign loss based on the best possible monotonic alignment between target tokens and model predictions. AXE-based training of conditional masked language models (CMLMs) substantially improves performance on major WMT benchmarks, while setting a new state of the art for non-autoregressive models.
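The alignment idea in the abstract can be illustrated with a small dynamic program. The sketch below is a simplified, hypothetical reading of the loss (the function name `axe_loss`, the exact recurrence, and the three moves — align, skip a prediction slot via a blank token, and let a target token share a slot — are assumptions for illustration; the paper defines the actual AXE recurrence and its normalization). It computes the cost of the best monotonic alignment between a target sequence and per-position model log-probabilities:

```python
import numpy as np

def axe_loss(log_probs, target, blank):
    """Cost of the best monotonic alignment between target tokens and predictions.

    log_probs: (T, V) array of log-probabilities over a vocabulary of size V
               at each of T prediction slots.
    target:    length-N sequence of target token ids.
    blank:     id of the blank/epsilon token emitted by skipped slots.
    """
    T, _ = log_probs.shape
    N = len(target)
    # A[i, j] = best cost of aligning the first i target tokens to the first j slots.
    A = np.full((N + 1, T + 1), np.inf)
    A[0, 0] = 0.0
    for j in range(1, T + 1):  # no target tokens yet: every slot emits blank
        A[0, j] = A[0, j - 1] - log_probs[j - 1, blank]
    for i in range(1, N + 1):
        tok = target[i - 1]
        for j in range(1, T + 1):
            align = A[i - 1, j - 1] - log_probs[j - 1, tok]   # match token i to slot j
            skip_pred = A[i, j - 1] - log_probs[j - 1, blank]  # slot j emits blank
            skip_tgt = A[i - 1, j] - log_probs[j - 1, tok]     # token i shares slot j
            A[i, j] = min(align, skip_pred, skip_tgt)
    return A[N, T]
```

The key property motivating the loss: if the model's predictions are correct but shifted by one position, plain cross entropy penalizes every position, while the alignment cost stays low because the leading slot can be absorbed as a blank. A differentiable version would replace the hard `min` with a soft minimum (or backpropagate through the argmin path, as a hard dynamic program permits).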
Pages: 9