Transformer-Based Direct Hidden Markov Model for Machine Translation

Cited by: 0
Authors
Wang, Weiyue [1 ]
Yang, Zijian [1 ]
Gao, Yingbo [1 ]
Ney, Hermann [1 ]
Affiliation
[1] RWTH Aachen University, Computer Science Department, Human Language Technology and Pattern Recognition Group, Aachen, Germany
Funding
European Research Council
Keywords
(none listed)
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The neural hidden Markov model has been proposed as an alternative to the attention mechanism in recurrent neural network based machine translation. However, since the introduction of transformer models, its performance has been surpassed. This work introduces the concept of the hidden Markov model into the transformer architecture and outperforms the transformer baseline. Interestingly, we find that the zero-order model already provides promising performance, giving it an edge over a model with first-order dependency, which performs comparably but is significantly slower in training and decoding.
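
To make the zero-order versus first-order distinction concrete, the following is a minimal sketch of the direct HMM factorization underlying this line of work, following the notation of the authors' earlier neural HMM paper (related paper [2] below); the exact conditioning used in the transformer variant is not given in this record, so read this as an illustrative assumption rather than the paper's definitive model:

p(e_1^I \mid f_1^J) = \sum_{b_1^I} \prod_{i=1}^{I} p(b_i \mid b_{i-1}, e_1^{i-1}, f_1^J) \cdot p(e_i \mid b_i, e_1^{i-1}, f_1^J)

Here e_1^I is the target sentence, f_1^J the source sentence, and b_i the source position aligned to target position i; the first factor is the alignment model and the second the lexicon model. In a first-order model the alignment probability depends on b_{i-1}, so the marginalization over alignments requires a forward-algorithm recursion over alignment states at every step. A zero-order model replaces that factor with p(b_i \mid i, e_1^{i-1}, f_1^J); the sum over alignments then factorizes per target position, which is consistent with the abstract's observation that the zero-order variant trains and decodes significantly faster.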
Pages: 23-32
Number of pages: 10
Related Papers
50 entries in total
  • [1] Training and analyzing a Transformer-based machine translation model
    Pimentel, Clovis Henrique Martins
    Pires, Thiago Blanch
    TEXTO LIVRE-LINGUAGEM E TECNOLOGIA, 2024, 17
  • [2] Neural Hidden Markov Model for Machine Translation
    Wang, Weiyue
    Zhu, Derui
    Alkhouli, Tamer
    Gan, Zixuan
    Ney, Hermann
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, 2018, : 377 - 382
  • [3] Transformer-Based Unified Neural Network for Quality Estimation and Transformer-Based Re-decoding Model for Machine Translation
    Chen, Cong
    Zong, Qinqin
    Luo, Qi
    Qiu, Bailian
    Li, Maoxi
    MACHINE TRANSLATION, CCMT 2020, 2020, 1328 : 66 - 75
  • [4] Learning Confidence for Transformer-based Neural Machine Translation
    Lu, Yu
    Zeng, Jiali
    Zhang, Jiajun
    Wu, Shuangzhi
    Li, Mu
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 2353 - 2364
  • [5] On compositional generalization of transformer-based neural machine translation
    Yin, Yongjing
    Fu, Lian
    Li, Yafu
    Zhang, Yue
    INFORMATION FUSION, 2024, 111
  • [6] Multiformer: A Head-Configurable Transformer-Based Model for Direct Speech Translation
    Sant, Gerard
    Gállego, Gerard I.
    Alastruey, Belen
    Costa-Jussà, Marta R.
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2022, : 277 - 284
  • [7] A Transformer-Based Hierarchical Variational AutoEncoder Combined Hidden Markov Model for Long Text Generation
    Zhao, Kun
    Ding, Hongwei
    Ye, Kai
    Cui, Xiaohui
    ENTROPY, 2021, 23 (10)
  • [8] Debugging Translations of Transformer-based Neural Machine Translation Systems
    Rikters, Matiss
    Pinnis, Marcis
    BALTIC JOURNAL OF MODERN COMPUTING, 2018, 6 (04): : 403 - 417
  • [9] Regression Loss in Transformer-based Supervised Neural Machine Translation
    Li, Dongxing
    Luo, Zuying
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2021, 16 (04) : 1 - 17