Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition

Cited by: 0
Authors
Tahir, Muhammad Ali [1 ]
Nussbaum-Thom, Markus [1 ]
Schlueter, Ralf [1 ]
Ney, Hermann [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Dept Comp Sci, Lehrstuhl Informat 6, Aachen, Germany
Keywords
speech recognition; log-linear modelling; discriminative training; models
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A method is proposed to incorporate mixture density splitting into discriminative training of the acoustic model for speech recognition. The standard method is to obtain a high-resolution acoustic model by maximum likelihood training and density splitting, and then to improve this model by discriminative training. We choose a log-linear form of the acoustic model because, for a single Gaussian density per triphone state, log-linear MMI optimization is a convex optimization problem; by further splitting and discriminative training of this model we can obtain a higher-complexity model. Previous work showed that this approach achieves large gains in the objective function and correspondingly moderate improvements in word error rate on a large-vocabulary corpus. This paper incorporates the state-of-the-art minimum phone error (MPE) training criterion into the framework, and shows that, after discriminative splitting, a subsequent log-linear MPE training achieves better results than Gaussian mixture model MPE optimization alone.
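As a rough illustration of why the abstract starts from a log-linear model, the sketch below checks numerically that one Gaussian per state with a pooled diagonal variance gives the same state posteriors as a log-linear (softmax) model with first-order features. The pooled variance, the frame-level posterior, and all function names and numbers here are illustrative assumptions, not details taken from the paper.

```python
import math


def softmax(scores):
    """Normalise log-scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def gaussian_posteriors(x, means, priors, var):
    """State posteriors from one diagonal Gaussian per state.

    Assumption: all states share the pooled variance vector `var`,
    so the Gaussian normalisation constant cancels in the posterior.
    """
    log_joint = []
    for mu, prior in zip(means, priors):
        ll = sum(-0.5 * (xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, var))
        log_joint.append(ll + math.log(prior))
    return softmax(log_joint)


def to_log_linear(means, priors, var):
    """Map Gaussian parameters to log-linear parameters.

    lambda_s = mu_s / var
    alpha_s  = log prior_s - 0.5 * sum(mu_s^2 / var)
    """
    lambdas = [[mi / vi for mi, vi in zip(mu, var)] for mu in means]
    alphas = [
        math.log(p) - 0.5 * sum(mi * mi / vi for mi, vi in zip(mu, var))
        for mu, p in zip(means, priors)
    ]
    return lambdas, alphas


def log_linear_posteriors(x, lambdas, alphas):
    """Posteriors of the log-linear model: softmax of lambda_s . x + alpha_s."""
    scores = [
        sum(li * xi for li, xi in zip(lam, x)) + a
        for lam, a in zip(lambdas, alphas)
    ]
    return softmax(scores)


# Hypothetical toy values: three states, two-dimensional features.
means = [[0.0, 1.0], [2.0, -1.0], [1.0, 1.0]]
priors = [0.5, 0.3, 0.2]
var = [1.0, 2.0]
x = [0.5, 0.2]

p_gauss = gaussian_posteriors(x, means, priors, var)
lambdas, alphas = to_log_linear(means, priors, var)
p_loglin = log_linear_posteriors(x, lambdas, alphas)
```

Because each score is linear in the parameters (lambda_s, alpha_s), the negative log-posterior of the correct state is a log-sum-exp minus a linear term, which is convex. With several densities per state the numerator also becomes a sum over hidden densities, and this convexity argument no longer applies.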
Pages: 570-573
Page count: 4
Related papers
50 in total
  • [21] Discriminative training of stochastic Markov graphs for speech recognition
    Wolfertstetter, F
    Ruske, G
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 581 - 584
  • [22] Boosting HMMs with an application to speech recognition
    Dimitrakakis, C
    Bengio, S
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: DESIGN AND IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS INDUSTRY TECHNOLOGY TRACKS MACHINE LEARNING FOR SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING SIGNAL PROCESSING FOR EDUCATION, 2004, : 621 - 624
  • [23] Large margin HMMs for speech recognition
    Li, XW
    Jiang, H
    Liu, CJ
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 513 - 516
  • [24] Discriminative Bernoulli HMMs for isolated handwritten word recognition
    Gimenez, Adria
    Andres-Ferrer, Jesus
    Juan, Alfons
    PATTERN RECOGNITION LETTERS, 2014, 35 : 157 - 168
  • [25] A constrained line search optimization for discriminative training in speech recognition
    Liu, Cong
    Liu, Peng
    Jiang, Hui
    Soong, Frank
    Wang, Ren-Hua
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 329 - +
  • [26] Discriminative estimation of subspace constrained Gaussian mixture models for speech recognition
    Axelrod, Scott
    Goel, Vaibhava
    Gopinath, Ramesh
    Olsen, Peder
    Visweswariah, Karthik
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (01): : 172 - 189
  • [27] Towards discriminative training estimators for HMM speech recognition system
    Frikha, Mondher
    Messaoud, Z. Ben
    Hamida, A. Ben
    Journal of Applied Sciences, 2007, 7 (24) : 3891 - 3899
  • [28] Comparison of discriminative training criteria and optimization methods for speech recognition
    Schlüter, R
    Macherey, W
    Müller, B
    Ney, H
    SPEECH COMMUNICATION, 2001, 34 (03) : 287 - 310
  • [29] Overview of large scale optimization for discriminative training in speech recognition
    Kanevsky, Dimitri
    Heigold, Georg
    Wright, Stephen
    Ney, Hermann
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 5233 - 5236
  • [30] Histogram equalization for noise-robust speech recognition using discrete-mixture HMMs
    Kosaka, Tetsuo
    Katoh, Masaharu
    Kohda, Masaki
    ACOUSTICAL SCIENCE AND TECHNOLOGY, 2008, 29 (01) : 66 - 73