Subspace Based Sequence Discriminative Training of LSTM Acoustic Models with Feed-Forward Layers

Cited by: 0
Authors
Samarakoon, Lahiru [1 ]
Mak, Brian [2 ]
Lam, Albert Y. S. [1 ]
Affiliations
[1] Fano Labs, Hong Kong, People's Republic of China
[2] Hong Kong Univ Sci & Technol, Hong Kong, People's Republic of China
Keywords
Long Short-Term Memory (LSTM); Recurrent Neural Networks (RNNs); Sequence Discriminative Training; Acoustic Modeling
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
State-of-the-art automatic speech recognition (ASR) systems use sequence discriminative training for improved performance over the frame-level cross-entropy (CE) criterion. Even though sequence discriminative training improves long short-term memory (LSTM) recurrent neural network (RNN) acoustic models (AMs), it is not clear whether these systems achieve optimal performance, since sequence discriminative training is prone to overfitting. This paper investigates the effect of state-level minimum Bayes risk (sMBR) training on LSTM AMs and shows that the conventional way of performing sMBR, updating all LSTM parameters, is not optimal. We investigate two methods to improve the performance of sequence discriminative training of LSTM AMs. First, additional feed-forward (FF) layers are inserted between the last LSTM layer and the output layer, so that these FF layers may benefit more from sMBR training. Second, a subspace is estimated as an interpolation of rank-1 matrices when performing sMBR for the LSTM layers of the AM. Our methods are evaluated on the benchmark AMI single distant microphone (SDM) task. We find that the proposed approaches provide a 1.6% absolute improvement over a strong sMBR-trained LSTM baseline.
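
The two proposed changes are only named in the abstract. As a rough, hypothetical illustration (not the authors' code), the PyTorch sketch below shows (a) an acoustic model with extra FF layers between the last LSTM layer and the output layer, and (b) a weight matrix whose sMBR update is constrained to an interpolation of rank-1 matrices added to frozen CE-trained weights. All class names, dimensions, and the exact parameterization are assumptions; the lattice-based sMBR loss itself is omitted.

import torch
import torch.nn as nn

class LstmAmWithFfLayers(nn.Module):
    """LSTM acoustic model with additional FF layers before the output layer."""

    def __init__(self, feat_dim=40, hidden=512, ff_dim=1024, n_ff=2, n_states=4000):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=3, batch_first=True)
        ff, in_dim = [], hidden
        for _ in range(n_ff):  # the extra FF layers that sMBR training can exploit
            ff += [nn.Linear(in_dim, ff_dim), nn.ReLU()]
            in_dim = ff_dim
        self.ff = nn.Sequential(*ff)
        self.out = nn.Linear(in_dim, n_states)

    def forward(self, x):  # x: (batch, frames, feat_dim)
        h, _ = self.lstm(x)
        return self.out(self.ff(h))  # per-frame senone logits

class Rank1Subspace(nn.Module):
    """W = W0 + sum_k alpha_k * u_k v_k^T, with the CE-trained W0 frozen."""

    def __init__(self, w0, k=8):
        super().__init__()
        self.register_buffer("w0", w0)  # frozen CE-trained weights
        out_dim, in_dim = w0.shape
        self.u = nn.Parameter(0.01 * torch.randn(k, out_dim))
        self.v = nn.Parameter(0.01 * torch.randn(k, in_dim))
        self.alpha = nn.Parameter(torch.zeros(k))  # interpolation weights

    def weight(self):
        # interpolate k rank-1 matrices and add them to the frozen base matrix
        delta = torch.einsum("k,ko,ki->oi", self.alpha, self.u, self.v)
        return self.w0 + delta

# Example: only alpha, u, and v receive sMBR gradients; w_ce stays fixed.
w_ce = torch.randn(512, 512)
w_smbr = Rank1Subspace(w_ce, k=8).weight()

Under this reading, sMBR backpropagation updates only the subspace parameters (plus the FF and output layers) while the CE-trained LSTM weights stay fixed; whether the paper freezes the base weights in exactly this way cannot be determined from the abstract alone.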
Pages: 136-140
Page count: 5
Related Papers
50 items in total
  • [31] Training algorithm with incomplete data for feed-forward neural networks
    Yoon, SY
    Lee, SY
    NEURAL PROCESSING LETTERS, 1999, 10 (03) : 171 - 179
  • [32] Discovery of Optimal Neurons and Hidden Layers in Feed-Forward Neural Network
    Thomas, Likewin
    Kumar, Manoj M. V.
    Annappa, B.
    2016 IEEE INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES AND INNOVATIVE BUSINESS PRACTICES FOR THE TRANSFORMATION OF SOCIETIES (EMERGITECH), 2016: 286 - 291
  • [33] Filtering Training Data When Training Feed-Forward Artificial Neural Network
    Moniz, Krishna
    Yuan, Yuyu
    TRUSTWORTHY COMPUTING AND SERVICES, 2014, 426 : 212 - 218
  • [34] Discriminative training of acoustic models for system combination
    Tachioka, Yuuki
    Watanabe, Shinji
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013: 2354 - 2358
  • [35] The role of nonlinearities in hierarchical feed-forward models for pattern recognition
    Eberhardt, S.
    Kluth, T.
    Fahle, M.
    Zetzsche, C.
    PERCEPTION, 2012, 41 : 241 - 241
  • [36] Flat acoustic sources with frequency response correction based on feedback and feed-forward distributed control
    Ho, Jen-Hsuan
    Berkhoff, Arthur P.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2015, 137 (04): 2080 - 2088
  • [37] A variable weighting based training data selection method for discriminative training of acoustic models
    Chen, Bin
    Niu, Tong
    Zhang, Lian-Hai
    Li, Bi-Cheng
    Qu, Dan
    Zidonghua Xuebao/Acta Automatica Sinica, 2014, 40 (12): 2899 - 2907
  • [38] Training data selection for improving discriminative training of acoustic models
    Liu, Shih-Hung
    Chu, Fang-Hui
    Lin, Shih-Hsiang
    Lee, Hung-Shin
    Chen, Berlin
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007: 284 - 289
  • [39] Training data selection for improving discriminative training of acoustic models
    Chen, Berlin
    Liu, Shih-Hung
    Chu, Fang-Hui
    PATTERN RECOGNITION LETTERS, 2009, 30 (13) : 1228 - 1235
  • [40] On the design of feed-forward ΣΔ interface for MEMS-based accelerometers
    Liang, Wenquan
    Yang, Changchun
    Qiao, Donghai
    Analog Integrated Circuits and Signal Processing, 2012, 72 : 3 - 10