Transformer-Based Direct Hidden Markov Model for Machine Translation

Cited by: 0
Authors
Wang, Weiyue [1 ]
Yang, Zijian [1 ]
Gao, Yingbo [1 ]
Ney, Hermann [1 ]
Affiliation
[1] RWTH Aachen University, Computer Science Department, Human Language Technology and Pattern Recognition Group, Aachen, Germany
Funding
European Research Council
Keywords
(none listed)
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The neural hidden Markov model has been proposed as an alternative to the attention mechanism in recurrent neural network based machine translation. However, since the introduction of transformer models, its performance has been surpassed. This work introduces the concept of the hidden Markov model into the transformer architecture and outperforms the transformer baseline. Interestingly, we find that the zero-order model already provides promising performance, giving it an edge over a model with first-order dependency, which performs comparably but is significantly slower in training and decoding.
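
To make the zero-order versus first-order distinction concrete, the following is a minimal sketch of the direct HMM factorization underlying this line of work, following the notation of the authors' earlier neural HMM paper (related paper [2] below); the exact conditioning used in the transformer variant is not given in this record, so read this as an illustrative assumption rather than the paper's definitive model:

p(e_1^I \mid f_1^J) = \sum_{b_1^I} \prod_{i=1}^{I} p(b_i \mid b_{i-1}, e_1^{i-1}, f_1^J) \cdot p(e_i \mid b_i, e_1^{i-1}, f_1^J)

Here e_1^I is the target sentence, f_1^J the source sentence, and b_i the source position aligned to target position i; the first factor is the alignment model and the second the lexicon model. In a first-order model the alignment probability depends on b_{i-1}, so the marginalization over alignments requires a forward-algorithm recursion over alignment states at every step. A zero-order model replaces that factor with p(b_i \mid i, e_1^{i-1}, f_1^J); the sum over alignments then factorizes per target position, which is consistent with the abstract's observation that the zero-order variant trains and decodes significantly faster.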
Pages: 23-32
Number of pages: 10
Related Papers
50 entries in total
  • [1] Training and analyzing a Transformer-based machine translation model
    Pimentel, Clovis Henrique Martins
    Pires, Thiago Blanch
    TEXTO LIVRE-LINGUAGEM E TECNOLOGIA, 2024, 17
  • [2] Neural Hidden Markov Model for Machine Translation
    Wang, Weiyue
    Zhu, Derui
    Alkhouli, Tamer
    Gan, Zixuan
    Ney, Hermann
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, 2018, : 377 - 382
  • [3] Transformer-Based Unified Neural Network for Quality Estimation and Transformer-Based Re-decoding Model for Machine Translation
    Chen, Cong
    Zong, Qinqin
    Luo, Qi
    Qiu, Bailian
    Li, Maoxi
    MACHINE TRANSLATION, CCMT 2020, 2020, 1328 : 66 - 75
  • [4] Learning Confidence for Transformer-based Neural Machine Translation
    Lu, Yu
    Zeng, Jiali
    Zhang, Jiajun
    Wu, Shuangzhi
    Li, Mu
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 2353 - 2364
  • [5] On compositional generalization of transformer-based neural machine translation
    Yin, Yongjing
    Fu, Lian
    Li, Yafu
    Zhang, Yue
    INFORMATION FUSION, 2024, 111
  • [6] Multiformer: A Head-Configurable Transformer-Based Model for Direct Speech Translation
    Sant, Gerard
    Gállego, Gerard I.
    Alastruey, Belen
    Costa-Jussà, Marta R.
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2022, : 277 - 284
  • [7] A Transformer-Based Hierarchical Variational AutoEncoder Combined Hidden Markov Model for Long Text Generation
    Zhao, Kun
    Ding, Hongwei
    Ye, Kai
    Cui, Xiaohui
    ENTROPY, 2021, 23 (10)
  • [8] Debugging Translations of Transformer-based Neural Machine Translation Systems
    Rikters, Matiss
    Pinnis, Marcis
    BALTIC JOURNAL OF MODERN COMPUTING, 2018, 6 (04): : 403 - 417
  • [9] Regression Loss in Transformer-based Supervised Neural Machine Translation
    Li, Dongxing
    Luo, Zuying
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2021, 16 (04) : 1 - 17