A Non-Autoregressive Neural Machine Translation Model With Iterative Length Update of Target Sentence

Cited by: 0
Authors
Lim, Yeon-Soo [1 ]
Park, Eun-Ju [1 ]
Song, Hyun-Je [2 ]
Park, Seong-Bae [1 ]
Affiliations
[1] Kyung Hee Univ, Dept Comp Sci & Engn, Yongin Si 17104, Gyeonggi Do, South Korea
[2] Jeonbuk Natl Univ, Dept Informat & Engn, Jeonju Si 54896, Jeollabuk Do, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Decoding; Iterative decoding; Generators; Adaptation models; Machine translation; Predictive models; Transformers; non-autoregressive decoder; sequence-to-sequence model; target length adaptation; transformer;
DOI
10.1109/ACCESS.2022.3169419
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Discipline Classification Code
0812;
Abstract
Non-autoregressive decoders for neural machine translation have received increasing attention because they decode faster than autoregressive decoders. However, their well-known weakness is low translation quality, which stems mainly from incorrect prediction of the target sentence length. To address this problem, this paper proposes a novel machine translation model with a new non-autoregressive decoder named the Iterative and Length-Adjustive Non-Autoregressive Decoder (ILAND). The decoder adopts a masked language model to avoid generating low-confidence tokens and iteratively adjusts the target sentence to an optimal length. To achieve these goals, ILAND consists of three complementary sub-modules: a token masker, a length adjuster, and a token generator. The token masker and the token generator realize the masked language model, while the length adjuster optimizes the target sentence length. A sequence-to-sequence training scheme for the translation model is also introduced, in which the length adjuster and the token generator are trained jointly because they share a similar structure. The effectiveness of the translation model is demonstrated empirically by showing that it outperforms models with various other non-autoregressive decoders. A thorough analysis suggests that the performance gain comes from target sentence length adaptation and the joint learning. In addition, ILAND is shown to be faster than other iterative non-autoregressive decoders while remaining robust against the multi-modality problem.
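The abstract outlines an iterative decoding loop in which the token generator fills masked positions in parallel, the length adjuster inserts or deletes positions, and the token masker re-masks low-confidence tokens for the next pass. The Python sketch below is not the authors' code; it only illustrates that control flow under stated assumptions: the callables generator_fn and length_fn, the MASK symbol, and the default mask ratio and iteration count are hypothetical placeholders, not ILAND's actual interface.

from typing import Callable, List, Tuple

MASK = "[MASK]"  # hypothetical placeholder for the masked-token symbol

def iland_style_decode(
    src_tokens: List[str],
    init_len: int,
    generator_fn: Callable[[List[str], List[str]], List[Tuple[str, float]]],
    length_fn: Callable[[List[str], List[str]], List[int]],
    mask_ratio: float = 0.3,
    max_iters: int = 4,
) -> List[str]:
    """Sketch of iterative, length-adjustive non-autoregressive decoding.

    generator_fn(src, tgt) -> one (token, confidence) pair per target position.
    length_fn(src, tgt)    -> one op per position: -1 delete, 0 keep, +1 insert a MASK after it.
    Both callables are assumed interfaces, not the paper's API.
    """
    tgt = [MASK] * init_len          # start from an all-masked target of the initial length
    conf = [0.0] * init_len
    for _ in range(max_iters):
        # 1) Token generator: fill every position in parallel with the masked language model.
        filled = generator_fn(src_tokens, tgt)
        tgt = [tok for tok, _ in filled]
        conf = [c for _, c in filled]

        # 2) Length adjuster: apply per-position delete / keep / insert operations.
        new_tgt, new_conf = [], []
        for tok, c, op in zip(tgt, conf, length_fn(src_tokens, tgt)):
            if op == -1:             # drop this position
                continue
            new_tgt.append(tok)
            new_conf.append(c)
            if op == +1:             # open a fresh masked slot after this position
                new_tgt.append(MASK)
                new_conf.append(0.0)
        tgt, conf = new_tgt, new_conf

        # 3) Token masker: re-mask the lowest-confidence tokens for the next pass.
        n_mask = int(len(tgt) * mask_ratio)
        if n_mask == 0 and MASK not in tgt:
            break                    # converged: nothing left to refine
        for i in sorted(range(len(tgt)), key=conf.__getitem__)[:n_mask]:
            tgt[i] = MASK

    return [tok for tok in tgt if tok != MASK]

With a real token generator and length adjuster plugged in, the loop stops either after the maximum number of refinement passes or as soon as no position remains to re-mask, which mirrors the iterative target-length update described in the abstract.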
Pages: 43341-43350
Number of pages: 10
Related Papers
50 records in total
  • [31] Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation
    Shao, Chenze
    Zhang, Jinchao
    Feng, Yang
    Meng, Fandong
    Zhou, Jie
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 198 - 205
  • [32] Non-Autoregressive Neural Machine Translation with Consistency Regularization Optimized Variational Framework
    Zhu, Minghao
    Wang, Junli
    Yan, Chungang
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 607 - 617
  • [33] Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation
    Guo, Junliang
    Tan, Xu
    Xu, Linli
    Qin, Tao
    Chen, Enhong
    Liu, Tie-Yan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7839 - 7846
  • [34] Non-Autoregressive Translation by Learning Target Categorical Codes
    Bao, Yu
    Huang, Shujian
    Xiao, Tong
    Wang, Dongqi
    Dai, Xinyu
    Chen, Jiajun
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5749 - 5759
  • [35] Incorporating history and future into non-autoregressive machine translation
    Wang, Shuheng
    Huang, Heyan
    Shi, Shumin
    COMPUTER SPEECH AND LANGUAGE, 2022, 77
  • [36] Non-Autoregressive Machine Translation: It's Not as Fast as it Seems
    Helcl, Jindrich
    Haddow, Barry
    Birch, Alexandra
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1780 - 1790
  • [37] Aligned Cross Entropy for Non-Autoregressive Machine Translation
    Ghazvininejad, Marjan
    Karpukhin, Vladimir
    Zettlemoyer, Luke
    Levy, Omer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [38] Non-Autoregressive Document-Level Machine Translation
    Bao, Guangsheng
    Teng, Zhiyang
    Zhou, Hao
    Yan, Jianhao
    Zhang, Yue
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 14791 - 14803
  • [39] Non-autoregressive Machine Translation with Disentangled Context Transformer
    Kasai, Jungo
    Cross, James
    Ghazvininejad, Marjan
    Gu, Jiatao
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [40] Efficient Domain Adaptation for Non-Autoregressive Machine Translation
    You, Wangjie
    Guo, Pei
    Li, Juntao
    Chen, Kehai
    Zhang, Min
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 13657 - 13670