Training data selection for improving discriminative training of acoustic models

Cited by: 9
Authors
Liu, Shih-Hung [1]
Chu, Fang-Hui [1]
Lin, Shih-Hsiang [1]
Lee, Hung-Shin [1]
Chen, Berlin [1]
Affiliations
[1] Natl Taiwan Normal Univ, Grad Inst Comp Sci & Informat Engn, Taipei, Taiwan
Keywords
speech recognition; discriminative training; acoustic models; data selection; entropy
DOI
10.1109/ASRU.2007.4430125
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches were proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance was used for utterance-level data selection. Second, phone-level data selection based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice was investigated. Finally, frame-level data selection based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice was explored. The underlying characteristics of the presented approaches were extensively investigated, and their performance was verified by comparison with the standard discriminative training approaches. Experiments conducted on Mandarin broadcast news collected in Taiwan showed that both phone- and frame-level data selection could achieve slight but consistent improvements over the baseline systems at lower numbers of training iterations.
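The frame-level criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each frame comes with a list of Gaussian posterior probabilities and that the entropy is normalized by dividing by log(N), where N is the number of Gaussians for that frame. The abstract does not give the selection rule or any threshold, so the function names, the "keep frames at or above a threshold" rule, and the value 0.1 are all hypothetical and are not the authors' implementation.

```python
import math


def normalized_entropy(posteriors):
    """Entropy of a posterior distribution divided by log(N), so it lies in [0, 1].

    Assumption: normalization by log(N); the abstract does not define it.
    """
    n = len(posteriors)
    total = sum(posteriors)
    if n < 2 or total <= 0.0:
        return 0.0  # a single (or empty) candidate set carries no uncertainty
    entropy = 0.0
    for p in posteriors:
        q = p / total  # renormalize in case the posteriors do not sum exactly to 1
        if q > 0.0:
            entropy -= q * math.log(q)
    return entropy / math.log(n)


def select_frames(frame_posteriors, threshold=0.1):
    """Return indices of frames whose normalized entropy is >= threshold.

    Hypothetical rule and threshold, chosen only for illustration.
    """
    return [
        t
        for t, post in enumerate(frame_posteriors)
        if normalized_entropy(post) >= threshold
    ]


if __name__ == "__main__":
    frames = [[0.5, 0.5], [1.0, 0.0], [0.9, 0.1]]
    print(select_frames(frames))
```

In this sketch a frame dominated by one Gaussian has entropy near 0 and is skipped, while a frame whose posterior mass is spread across Gaussians has entropy near 1 and is kept.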
Pages: 284-289
Number of pages: 6
Related papers
50 in total
  • [1] Training data selection for improving discriminative training of acoustic models
    Chen, Berlin
    Liu, Shih-Hung
    Chu, Fang-Hui
    PATTERN RECOGNITION LETTERS, 2009, 30 (13) : 1228 - 1235
  • [2] A variable weighting based training data selection method for discriminative training of acoustic models
    Chen, Bin
    Niu, Tong
    Zhang, Lian-Hai
    Li, Bi-Cheng
    Qu, Dan
    Zidonghua Xuebao/Acta Automatica Sinica, 2014, 40 (12) : 2899 - 2907
  • [3] Discriminative training of acoustic models for system combination
    Tachioka, Yuuki
    Watanabe, Shinji
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2354 - 2358
  • [4] DISCRIMINATIVE IMPORTANCE WEIGHTING OF AUGMENTED TRAINING DATA FOR ACOUSTIC MODEL TRAINING
    Sivasankaran, Sunit
    Vincent, Emmanuel
    Illina, Irina
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4885 - 4889
  • [5] Improving Discriminative Training for Robust Acoustic Models in Large Vocabulary Continuous Speech Recognition
    Pylkkonen, Janne
    Kurimo, Mikko
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1210 - 1213
  • [6] Discriminative Training of Gender-Dependent Acoustic Models
    Vanek, Jan
    Psutka, Josef V.
    Zelinka, Jan
    Prazak, Ales
    Psutka, Josef
    TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2009, 5729 : 331 - 338
  • [7] Efficient training of discriminative language models by sample selection
    Oba, Takanobu
    Hori, Takaaki
    Nakamura, Atsushi
    SPEECH COMMUNICATION, 2012, 54 (06) : 791 - 800
  • [8] Investigating data selection for minimum phone error training of acoustic models
    Liu, Shih-Hung
    Chu, Fang-Hui
    Lin, Shih-Hsiang
    Chen, Berlin
    2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007, : 348 - 351
  • [9] Leveraging Unlabeled Speech for Sequence Discriminative Training of Acoustic Models
    Sapru, Ashtosh
    Garimella, Sri
    INTERSPEECH 2020, 2020, : 3585 - 3589
  • [10] Discriminative training of acoustic models applied to domains with unreliable transcripts
    Mathias, L
    Yegnanarayanan, G
    Fritsch, J
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 109 - 112