Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition

Cited by: 0
Authors
Tahir, Muhammad Ali [1]
Nussbaum-Thom, Markus [1]
Schlueter, Ralf [1]
Ney, Hermann [1]
Affiliations
[1] Rhein Westfal TH Aachen, Dept Comp Sci, Lehrstuhl Informat 6, Aachen, Germany
Keywords
speech recognition; log linear modelling; discriminative training; MODELS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A method is proposed to incorporate mixture density splitting into discriminative training of the acoustic model for speech recognition. The standard method is to obtain a high-resolution acoustic model by maximum likelihood training and density splitting, and then to improve this model by discriminative training. We choose a log-linear form of acoustic model because, for a single Gaussian density per triphone state, the log-linear MMI optimization is a convex optimization problem, and by further splitting and discriminative training of this model we can obtain a higher-complexity model. Previously it was shown that this approach achieves large gains in the objective function and corresponding moderate gains in the word error rate on a large-vocabulary corpus. This paper incorporates the state-of-the-art minimum phone error (MPE) training criterion into the framework, and shows that after discriminative splitting, a subsequent log-linear MPE training achieves better results than Gaussian mixture model MPE optimization alone.
Pages: 570-573
Page count: 4