Transformer-Based Direct Hidden Markov Model for Machine Translation

Cited by: 0
Authors
Wang, Weiyue [1 ]
Yang, Zijian [1 ]
Gao, Yingbo [1 ]
Ney, Hermann [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Comp Sci Dept, Human Language Technol & Pattern Recognit Grp, Aachen, Germany
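Funding
European Research Council;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code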
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The neural hidden Markov model has been proposed as an alternative to the attention mechanism in machine translation with recurrent neural networks. However, since the introduction of transformer models, its performance has been surpassed. This work proposes to introduce the concept of the hidden Markov model to the transformer architecture, which outperforms the transformer baseline. Interestingly, we find that the zero-order model already provides promising performance, giving it an edge over a model with first-order dependency, which performs similarly but is significantly slower in training and decoding.
Pages: 23 - 32
Page count: 10
Related papers
50 in total
  • [31] SignNet II: A Transformer-Based Two-Way Sign Language Translation Model
    Chaudhary, Lipisha
    Ananthanarayana, Tejaswini
    Hoq, Enjamamul
    Nwogu, Ifeoma
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (11) : 12896 - 12907
  • [32] EMG processing based on hidden Markov model and support vector machine
    Chen, LL
    Yang, P
    Guo, X
    Wang, H
    ICEMI 2005: CONFERENCE PROCEEDINGS OF THE SEVENTH INTERNATIONAL CONFERENCE ON ELECTRONIC MEASUREMENT & INSTRUMENTS, VOL 6, 2005, : 245 - 249
  • [34] Improving Language Translation Using the Hidden Markov Model
    Chang, Yunpeng
    Wang, Xiaoliang
    Xue, Meihua
    Liu, Yuzhen
    Jiang, Frank
    CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 67 (03) : 3921 - 3931
  • [35] On Block g-Circulant Matrices with Discrete Cosine and Sine Transforms for Transformer-Based Translation Machine
    Asriani, Euis
    Muchtadi-Alamsyah, Intan
    Purwarianti, Ayu
    MATHEMATICS, 2024, 12 (11)
  • [36] Translation template learning based on hidden Markov modeling
    Le, NN
    Shimazu, A
    Horiguchi, S
    PACLIC 17: Language, Information and Computation, Proceedings, 2003, : 269 - 276
  • [37] Direct conversion of peptides into diverse peptidomimetics using a transformer-based chemical language model
    Yoshimori, Atsushi
    Bajorath, Juergen
    EUROPEAN JOURNAL OF MEDICINAL CHEMISTRY REPORTS, 2025, 13
  • [38] Vision Transformer-Based Photovoltaic Prediction Model
    Kang, Zaohui
    Xue, Jizhong
    Lai, Chun Sing
    Wang, Yu
    Yuan, Haoliang
    Xu, Fangyuan
    ENERGIES, 2023, 16 (12)
  • [39] Transformer-Based Model for Electrical Load Forecasting
    L'Heureux, Alexandra
    Grolinger, Katarina
    Capretz, Miriam A. M.
    ENERGIES, 2022, 15 (14)
  • [40] Transformer-Based Model for Auditory EEG Decoding
    Chen, Jiaxin
    Liu, Yin-Long
    Feng, Rui
    Yuan, Jiahong
    Ling, Zhen-Hua
    MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024, 2025, 2312 : 129 - 143