Training data selection for improving discriminative training of acoustic models

Cited by: 9
Authors
Liu, Shih-Hung [1]
Chu, Fang-Hui [1]
Lin, Shih-Hsiang [1]
Lee, Hung-Shin [1]
Chen, Berlin [1]
Affiliation
[1] Natl Taiwan Normal Univ, Grad Inst Comp Sci & Informat Engn, Taipei, Taiwan
Keywords
speech recognition; discriminative training; acoustic models; data selection; entropy
DOI
10.1109/ASRU.2007.4430125
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches are proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance is used for utterance-level data selection. Second, phone-level data selection is investigated, based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice. Finally, frame-level data selection is explored, based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice. The underlying characteristics of the presented approaches are extensively investigated, and their performance is verified by comparison with standard discriminative training approaches. Experiments on Mandarin broadcast news collected in Taiwan show that both phone- and frame-level data selection achieve slight but consistent improvements over the baseline systems when fewer training iterations are used.
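The frame-level criterion in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common definition of normalized entropy (Shannon entropy divided by log N, where N is the number of competing Gaussians at a frame). The function names, the threshold value, and the choice to keep high-entropy frames are all hypothetical and are not taken from the paper.

```python
import math


def normalized_entropy(posteriors):
    """Shannon entropy of a posterior distribution, scaled to [0, 1].

    Assumed definition: H = -sum(p * log p) / log(N), with N the number
    of competing Gaussians at the frame. The paper may define it differently.
    """
    n = len(posteriors)
    total = sum(posteriors)
    if n < 2 or total <= 0.0:
        return 0.0  # a single (or empty) candidate set carries no uncertainty
    probs = [p / total for p in posteriors]  # renormalize defensively
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(n)


def select_frames(frame_posteriors, threshold=0.5):
    """Return indices of frames whose normalized entropy is at least
    `threshold` (a hypothetical cutoff; keeping the more confusable
    frames is an assumption of this sketch)."""
    return [
        t
        for t, posteriors in enumerate(frame_posteriors)
        if normalized_entropy(posteriors) >= threshold
    ]


if __name__ == "__main__":
    # Toy posteriors for three frames; values are made up for illustration.
    frames = [[0.5, 0.5], [1.0, 0.0], [0.25, 0.25, 0.25, 0.25]]
    print(select_frames(frames))
```

Under these assumptions, a frame whose posterior mass sits on one Gaussian scores near 0 and is skipped, while a frame with evenly spread posteriors scores near 1 and is kept for training.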
Pages: 284-289
Page count: 6