Improving transformer-based acoustic model performance using sequence discriminative training

Cited by: 0
Authors
Lee, Chae-Won [1]
Chang, Joon-Hyuk [1]
Affiliations
[1] Hanyang Univ, Dept Elect Engn, 222, Wangsimni-ro, Seoul 04763, South Korea
Source
Keywords
Speech recognition; Transformer; Sequence discriminative training; Weighted finite state transducer;
DOI
10.7776/ASK.2022.41.3.335
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline codes
070206; 082403;
Abstract
In this paper, we adopt the transformer, which has shown remarkable performance in natural language processing, as the acoustic model of a hybrid speech recognition system. The transformer acoustic model processes sequential data with attention structures and achieves high performance at low computational cost. This paper proposes a method to improve the performance of the transformer acoustic model by applying each of four sequence discriminative training algorithms, a weighted finite-state transducer (wFST)-based training approach used in existing DNN-HMM models. Compared with the Cross Entropy (CE) training method, sequence discriminative training achieves a 5 % relative Word Error Rate (WER) improvement.
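The abstract does not name the four sequence discriminative criteria it applies. For reference only, a standard objective used in wFST-based sequence discriminative training is the Maximum Mutual Information (MMI) criterion; this is a general illustration, not necessarily one of the paper's exact formulations:

\[
\mathcal{F}_{\mathrm{MMI}}(\theta) \;=\; \sum_{u} \log \frac{p_{\theta}(O_u \mid W_u)^{\kappa}\, P(W_u)}{\sum_{W} p_{\theta}(O_u \mid W)^{\kappa}\, P(W)}
\]

where \(O_u\) is the acoustic observation sequence of utterance \(u\), \(W_u\) its reference transcription, \(P(W)\) the language-model probability, and \(\kappa\) an acoustic scaling factor. The denominator sum over competing hypotheses \(W\) is typically approximated with a word lattice compiled as a wFST.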
Pages: 335-341
Number of pages: 7
Related papers
50 records in total
  • [21] Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
    Andoorveedu, Muralidhar
    Zhu, Zhanda
    Zheng, Bojian
    Pekhimenko, Gennady
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [22] In-Context Learning for MIMO Equalization Using Transformer-Based Sequence Models
    Zecchin, Matteo
    Yu, Kai
    Simeone, Osvaldo
    2024 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS, ICC WORKSHOPS 2024, 2024, : 1573 - 1578
  • [23] Leveraging Unlabeled Speech for Sequence Discriminative Training of Acoustic Models
    Sapru, Ashtosh
    Garimella, Sri
    INTERSPEECH 2020, 2020, : 3585 - 3589
  • [24] High performance binding affinity prediction with a Transformer-based surrogate model
    Vasan, Archit
    Gokdemir, Ozan
    Brace, Alexander
    Ramanathan, Arvind
    Brettin, Thomas
    Stevens, Rick
    Vishwanath, Venkatram
    2024 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW 2024, 2024, : 571 - 580
  • [25] Improving scene text image captioning using transformer-based multilevel attention
    Srivastava, Swati
    Sharma, Himanshu
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (03)
  • [26] T-SPP: Improving GNSS Single-Point Positioning Performance Using Transformer-Based Correction
    Wu, Fan
    Wei, Liangrui
    Luo, Haiyong
    Zhao, Fang
    Ma, Xin
    Ning, Bokun
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2024, 2024
  • [27] Transformer-based temporal sequence learners for arrhythmia classification
    Varghese, Ann
    Kamal, Suraj
    Kurian, James
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2023, 61 (08) : 1993 - 2000
  • [29] Ouroboros: On Accelerating Training of Transformer-Based Language Models
    Yang, Qian
    Huo, Zhouyuan
    Wang, Wenlin
    Huang, Heng
    Carin, Lawrence
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [30] Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
    Wu, Chunyang
    Wang, Yongqiang
    Shi, Yangyang
    Yeh, Ching-Feng
    Zhang, Frank
    INTERSPEECH 2020, 2020, : 2132 - 2136