Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition

Authors:

Tahir, Muhammad Ali [1]

Nussbaum-Thom, Markus [1]

Schlueter, Ralf [1]

Ney, Hermann [1]

Affiliation:

[1] Rhein Westfal TH Aachen, Dept Comp Sci, Lehrstuhl Informat 6, Aachen, Germany

Source:

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 | 2012

Keywords:

speech recognition; log-linear modelling; discriminative training; models

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

A method is proposed to incorporate mixture density splitting into the discriminative training of acoustic models for speech recognition. The standard approach is to obtain a high-resolution acoustic model by maximum likelihood training and density splitting, and then to improve this model by discriminative training. We choose a log-linear form of the acoustic model because, for a single Gaussian density per triphone state, log-linear maximum mutual information (MMI) optimization is a convex optimization problem; by further splitting and discriminatively training this model, we can obtain a model of higher complexity. It was previously shown that this approach achieves large gains in the objective function and corresponding moderate gains in word error rate on a large-vocabulary corpus. This paper incorporates the state-of-the-art minimum phone error (MPE) training criterion into the framework and shows that, after discriminative splitting, a subsequent log-linear MPE training achieves better results than Gaussian mixture model MPE optimization alone.
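For orientation, the conventional density splitting step mentioned in the abstract can be sketched as below. This is only a minimal illustration of a common maximum likelihood style split, assuming diagonal covariances and a perturbation of each mean by a fraction of its standard deviation. The function name `split_mixture` and the parameter `eps` are hypothetical, and the sketch is not the discriminative splitting procedure of the paper, whose details the abstract does not give.

```python
import numpy as np


def split_mixture(weights, means, variances, eps=0.2):
    """Split every diagonal-covariance Gaussian of a mixture into two.

    Assumption (not from the paper): each mean is shifted by
    +/- eps standard deviations, the variances are copied, and the
    mixture weight is shared equally between the two new densities.
    """
    weights = np.asarray(weights, dtype=float)      # shape (K,)
    means = np.asarray(means, dtype=float)          # shape (K, D)
    variances = np.asarray(variances, dtype=float)  # shape (K, D)

    offset = eps * np.sqrt(variances)
    new_means = np.concatenate([means + offset, means - offset], axis=0)
    new_variances = np.concatenate([variances, variances], axis=0)
    new_weights = np.concatenate([weights, weights]) / 2.0
    return new_weights, new_means, new_variances


# Usage: one two-dimensional density becomes two densities.
w, m, v = split_mixture([1.0], [[0.0, 2.0]], [[1.0, 4.0]], eps=0.2)
```

After such a split the model is retrained; in the pipeline described above, that retraining would use a discriminative criterion rather than maximum likelihood.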

Pages: 570-573

Page count: 4
