Tone model integration based on discriminative weight training for Putonghua speech recognition

Authors
HUANG Hao, ZHU Jie (Department of Electronic Engineering
Affiliation
Keywords
mode; MPE; FMD; TSD; SFM; MCD; HMM
DOI
10.15949/j.cnki.0217-9776.2008.03.007
Chinese Library Classification (CLC) number
TN912.34 [Speech recognition and equipment]
Discipline classification code
Abstract
A discriminative framework for tone model integration in continuous speech recognition is proposed. The method uses model-dependent weights to scale the probabilities of the hidden Markov models based on spectral features and of the tone models based on tonal features. The weights are discriminatively trained using the minimum phone error criterion. An update equation for the model weights, based on the extended Baum-Welch algorithm, is derived. Various schemes of model weight combination are evaluated, and a smoothing technique is introduced to make training robust to overfitting. The proposed method is evaluated on tonal syllable output and character output speech recognition tasks. The experimental results show that the proposed method achieves 9.5% and 4.7% relative error reductions over a global weight on the two tasks, owing to a better interpolation of the given models. This demonstrates the effectiveness of discriminatively trained model weights for tone model integration.
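The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes a log-linear combination in which each model's log-probability is multiplied by its own weight. The abstract does not give the exact formulation, so the function `combined_log_score`, the dictionary layout, and all scores and weights below are hypothetical. The MPE training that would actually estimate the weights is not shown.

```python
# Minimal sketch, NOT the paper's implementation: scale each model's
# log-probability by a model-dependent weight and sum over the hypothesis.
# All names, scores, and weights here are made up for illustration.

def combined_log_score(segments, hmm_weights, tone_weights, default_weight=1.0):
    """Return the weighted sum of spectral (HMM) and tone log-probabilities.

    Models missing from the weight tables fall back to `default_weight`,
    so empty tables reproduce a single global weight.
    """
    total = 0.0
    for seg in segments:
        w_hmm = hmm_weights.get(seg["hmm_model"], default_weight)
        w_tone = tone_weights.get(seg["tone_model"], default_weight)
        total += w_hmm * seg["hmm_logp"] + w_tone * seg["tone_logp"]
    return total


# Two competing one-syllable hypotheses with hypothetical log-probabilities.
hyp_a = [{"hmm_model": "ma", "tone_model": "tone1", "hmm_logp": -10.0, "tone_logp": -2.0}]
hyp_b = [{"hmm_model": "ma", "tone_model": "tone3", "hmm_logp": -9.5, "tone_logp": -4.0}]

# Global weight: every model is scaled by the same factor.
global_a = combined_log_score(hyp_a, {}, {})
global_b = combined_log_score(hyp_b, {}, {})

# Model-dependent weights: each tone model has its own (hypothetical) scale.
tone_w = {"tone1": 0.5, "tone3": 0.1}
dep_a = combined_log_score(hyp_a, {}, tone_w)
dep_b = combined_log_score(hyp_b, {}, tone_w)

print(global_a > global_b)  # hypothesis A ranks first under the global weight
print(dep_a > dep_b)        # the ranking changes under model-dependent weights
```

In this toy case the two weighting schemes rank the hypotheses differently, which is the extra freedom that per-model weights provide over a single global weight.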
Pages: 193-202
Number of pages: 10