Improving transformer-based acoustic model performance using sequence discriminative training

Cited by: 0
Authors
Lee, Chae-Won [1]
Chang, Joon-Hyuk [1]
Affiliations
[1] Hanyang Univ, Dept Elect Engn, 222, Wangsimni-ro, Seoul 04763, South Korea
Source
Keywords
Speech recognition; Transformer; Sequence discriminative training; Weighted finite state transducer;
DOI
10.7776/ASK.2022.41.3.335
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline codes
070206; 082403;
Abstract
In this paper, we adopt the transformer, which has shown remarkable performance in natural language processing, as the acoustic model of a hybrid speech recognition system. The transformer acoustic model processes sequential data with attention structures and achieves high performance at low computational cost. This paper proposes a method to improve the performance of the transformer acoustic model by applying each of four sequence discriminative training algorithms, a weighted finite-state transducer (wFST)-based training approach used in existing DNN-HMM models. Compared with the Cross Entropy (CE) training method, sequence discriminative training achieves a 5 % relative Word Error Rate (WER) improvement.
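The abstract does not name the four sequence discriminative criteria it applies. For reference only, a standard objective used in wFST-based sequence discriminative training is the Maximum Mutual Information (MMI) criterion; this is a general illustration, not necessarily one of the paper's exact formulations:

\[
\mathcal{F}_{\mathrm{MMI}}(\theta) \;=\; \sum_{u} \log \frac{p_{\theta}(O_u \mid W_u)^{\kappa}\, P(W_u)}{\sum_{W} p_{\theta}(O_u \mid W)^{\kappa}\, P(W)}
\]

where \(O_u\) is the acoustic observation sequence of utterance \(u\), \(W_u\) its reference transcription, \(P(W)\) the language-model probability, and \(\kappa\) an acoustic scaling factor. The denominator sum over competing hypotheses \(W\) is typically approximated with a word lattice compiled as a wFST.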
Pages: 335-341
Number of pages: 7
Related papers
50 records in total
  • [21] Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
    Andoorveedu, Muralidhar
    Zhu, Zhanda
    Zheng, Bojian
    Pekhimenko, Gennady
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [22] In-Context Learning for MIMO Equalization Using Transformer-Based Sequence Models
    Zecchin, Matteo
    Yu, Kai
    Simeone, Osvaldo
    2024 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS, ICC WORKSHOPS 2024, 2024, : 1573 - 1578
  • [23] Leveraging Unlabeled Speech for Sequence Discriminative Training of Acoustic Models
    Sapru, Ashtosh
    Garimella, Sri
    INTERSPEECH 2020, 2020, : 3585 - 3589
  • [24] High performance binding affinity prediction with a Transformer-based surrogate model
    Vasan, Archit
    Gokdemir, Ozan
    Brace, Alexander
    Ramanathan, Arvind
    Brettin, Thomas
    Stevens, Rick
    Vishwanath, Venkatram
    2024 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW 2024, 2024, : 571 - 580
  • [25] Improving scene text image captioning using transformer-based multilevel attention
    Srivastava, Swati
    Sharma, Himanshu
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (03)
  • [26] T-SPP: Improving GNSS Single-Point Positioning Performance Using Transformer-Based Correction
    Wu, Fan
    Wei, Liangrui
    Luo, Haiyong
    Zhao, Fang
    Ma, Xin
    Ning, Bokun
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2024, 2024
  • [27] Transformer-based temporal sequence learners for arrhythmia classification
    Varghese, Ann
    Kamal, Suraj
    Kurian, James
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2023, 61 (08) : 1993 - 2000
  • [29] Ouroboros: On Accelerating Training of Transformer-Based Language Models
    Yang, Qian
    Huo, Zhouyuan
    Wang, Wenlin
    Huang, Heng
    Carin, Lawrence
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [30] Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
    Wu, Chunyang
    Wang, Yongqiang
    Shi, Yangyang
    Yeh, Ching-Feng
    Zhang, Frank
    INTERSPEECH 2020, 2020, : 2132 - 2136