A Non-Autoregressive Neural Machine Translation Model With Iterative Length Update of Target Sentence

Cited: 0
Authors
Lim, Yeon-Soo [1]
Park, Eun-Ju [1]
Song, Hyun-Je [2]
Park, Seong-Bae [1]
Affiliations
[1] Kyung Hee Univ, Dept Comp Sci & Engn, Yongin-si 17104, Gyeonggi-do, South Korea
[2] Jeonbuk Natl Univ, Dept Informat & Engn, Jeonju-si 54896, Jeollabuk-do, South Korea
Funding
National Research Foundation of Singapore
Keywords
Decoding; Iterative decoding; Generators; Adaptation models; Machine translation; Predictive models; Transformers; non-autoregressive decoder; sequence-to-sequence model; target length adaptation; transformer
DOI
10.1109/ACCESS.2022.3169419
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Non-autoregressive decoders in neural machine translation have received increasing attention because they decode faster than autoregressive decoders. However, they suffer from low translation quality, which stems mainly from incorrect prediction of the target sentence length. To address this problem, this paper proposes a novel machine translation model with a new non-autoregressive decoder named Iterative and Length-Adjustive Non-Autoregressive Decoder (ILAND). This decoder adopts a masked language model to avoid generating low-confidence tokens and iteratively adjusts the target sentence length toward an optimal value. To achieve these goals, ILAND consists of three complementary sub-modules: a token masker, a length adjuster, and a token generator. The token masker and the token generator implement the masked language model, while the length adjuster optimizes the target sentence length. A sequence-to-sequence training scheme for the translation model is also introduced, in which the length adjuster and the token generator are trained jointly since they share a similar structure. The effectiveness of the translation model is demonstrated empirically by showing that it outperforms other models with various non-autoregressive decoders. A thorough analysis suggests that the performance gain comes from the target sentence length adaptation and the joint learning. In addition, ILAND is shown to be faster than other iterative non-autoregressive decoders while remaining robust against the multi-modality problem.
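To make the decoding loop described in the abstract concrete, the following is a minimal sketch of iterative mask-predict decoding with per-iteration length adjustment. It is written under stated assumptions, not from the paper: `token_generator`, `length_adjuster`, the `MASK` symbol, and the decaying mask-ratio schedule are hypothetical stand-ins for illustration only, not the authors' implementation or API.

```python
import random

MASK = "<mask>"

def token_generator(tokens):
    # Hypothetical stand-in for the token generator: fill every masked
    # position with a token and return a confidence score per position.
    filled = [t if t != MASK else random.choice(list("abcde")) for t in tokens]
    scores = [1.0 if t != MASK else random.random() for t in tokens]
    return filled, scores

def length_adjuster(tokens):
    # Hypothetical stand-in for the length adjuster: per position, emit
    # -1 (delete), 0 (keep), or +1 (keep, then insert a mask after it).
    return [random.choice([-1, 0, 0, 0, 0, +1]) for _ in tokens]

def iterative_decode(init_len=8, num_iters=4, mask_ratio=0.5):
    tokens = [MASK] * init_len
    for it in range(num_iters):
        # Token generator: re-predict all masked positions.
        tokens, scores = token_generator(tokens)
        # Length adjuster: insert/delete positions so the hypothesis
        # drifts toward a better target length across iterations.
        adjusted, adj_scores = [], []
        for tok, score, op in zip(tokens, scores, length_adjuster(tokens)):
            if op == -1:
                continue                      # delete this position
            adjusted.append(tok); adj_scores.append(score)
            if op == +1:
                adjusted.append(MASK); adj_scores.append(0.0)
        tokens, scores = (adjusted, adj_scores) if adjusted else ([MASK], [0.0])
        # Token masker: re-mask the lowest-confidence tokens so the next
        # pass can re-predict them; the mask ratio decays over iterations.
        k = int(len(tokens) * mask_ratio * (num_iters - 1 - it) / num_iters)
        for idx in sorted(range(len(tokens)), key=scores.__getitem__)[:k]:
            tokens[idx] = MASK
    tokens, _ = token_generator(tokens)       # fill any remaining masks
    return tokens

print(" ".join(iterative_decode()))
```

The sketch keeps the division of labor named in the abstract: masker and generator together realize the masked language model, while the adjuster changes the hypothesis length between refinement passes instead of committing to a single up-front length prediction.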
Pages: 43341-43350 (10 pages)