Transformer-Based Direct Hidden Markov Model for Machine Translation

Cited by: 0
Authors
Wang, Weiyue [1]
Yang, Zijian [1]
Gao, Yingbo [1]
Ney, Hermann [1]
Affiliations
[1] RWTH Aachen University, Computer Science Department, Human Language Technology and Pattern Recognition Group, Aachen, Germany
Funding
European Research Council
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The neural hidden Markov model has been proposed as an alternative to the attention mechanism in machine translation with recurrent neural networks. Since the introduction of transformer models, however, its performance has been surpassed. This work proposes introducing the concept of the hidden Markov model into the transformer architecture, which outperforms the transformer baseline. Interestingly, we find that the zero-order model already provides promising performance, giving it an edge over the first-order model, which performs similarly but is significantly slower in training and decoding.
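To make the abstract's zero-order versus first-order distinction concrete, here is a minimal sketch of a direct HMM factorization for translation in LaTeX (the notation and decomposition are our assumption, not taken from the paper itself). With source sentence $f_1^J$, target sentence $e_1^I$, and hidden alignments $b_1^I$:

$$
p(e_1^I \mid f_1^J) \;=\; \sum_{b_1^I} \prod_{i=1}^{I}
\underbrace{p(e_i \mid b_i, e_1^{i-1}, f_1^J)}_{\text{lexicon model}}
\cdot
\underbrace{p(b_i \mid b_{i-1}, e_1^{i-1}, f_1^J)}_{\text{alignment model}}.
$$

In the first-order case the alignment model depends on the previous alignment $b_{i-1}$, so marginalizing over alignment sequences requires the forward algorithm; in the zero-order case it reduces to $p(b_i \mid e_1^{i-1}, f_1^J)$ and the marginalization factorizes over target positions, which is consistent with the reported speed advantage in training and decoding.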
Pages: 23-32 (10 pages)