Training data selection for improving discriminative training of acoustic models

Cited by: 9
Authors
Liu, Shih-Hung [1]
Chu, Fang-Hui [1]
Lin, Shih-Hsiang [1]
Lee, Hung-Shin [1]
Chen, Berlin [1]
Affiliations
[1] Natl Taiwan Normal Univ, Grad Inst Comp Sci & Informat Engn, Taipei, Taiwan
Keywords
speech recognition; discriminative training; acoustic models; data selection; entropy
DOI
10.1109/ASRU.2007.4430125
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches were proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance was utilized for utterance-level data selection. Second, phone-level data selection based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice was investigated. Finally, frame-level data selection based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice was explored. The underlying characteristics of the presented approaches were extensively investigated and their performance was verified by comparison with the standard discriminative training approaches. Experiments conducted on Mandarin broadcast news collected in Taiwan showed that both phone- and frame-level data selection could achieve slight but consistent improvements over the baseline systems at lower training iterations.
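The frame-level criterion in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption rather than the authors' method: the entropy is normalized by dividing by the log of the number of Gaussians, frames are kept when that value reaches a threshold, and the names `normalized_entropy`, `select_frames`, and `threshold` are hypothetical.

```python
import math
from typing import List, Sequence


def normalized_entropy(posteriors: Sequence[float]) -> float:
    """Shannon entropy of a posterior distribution, divided by log(N).

    Assumption: normalizing by log(N) maps the entropy to [0, 1];
    the paper may define the normalization differently.
    """
    n = len(posteriors)
    if n < 2:
        return 0.0  # a single candidate carries no uncertainty
    total = sum(posteriors)
    if total <= 0:
        raise ValueError("posteriors must have a positive sum")
    entropy = 0.0
    for p in posteriors:
        q = p / total  # renormalize in case the values do not sum to 1
        if q > 0:
            entropy -= q * math.log(q)
    return entropy / math.log(n)


def select_frames(frame_posteriors: Sequence[Sequence[float]],
                  threshold: float = 0.5) -> List[int]:
    """Return indices of frames whose normalized entropy is >= threshold.

    Assumption: high-entropy (confusable) frames are the ones kept;
    both the direction and the threshold value are illustrative only.
    """
    return [t for t, post in enumerate(frame_posteriors)
            if normalized_entropy(post) >= threshold]
```

For instance, a frame with posteriors `[0.5, 0.5]` has normalized entropy 1 (maximally ambiguous), while `[1.0, 0.0]` has 0 (fully certain), so only the first would pass a threshold of 0.5.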
Pages: 284-289
Number of pages: 6
Related papers
50 in total
  • [21] A Regularized Discriminative Training Method of Acoustic Models Derived by Minimum Relative Entropy Discrimination
    Kubo, Yotaro
    Watanabe, Shinji
    Nakamura, Atsushi
    Kobayashi, Tetsunori
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2954 - +
  • [22] IMPROVING SELECTION AND TRAINING OF ATTENDANTS
    PRYER, RS
    DISTEFANO, MK
    PRYER, MW
    HOSPITAL AND COMMUNITY PSYCHIATRY, 1967, 18 (06): : 169 - 170
  • [23] REDUCTION OF ACOUSTIC MODEL TRAINING TIME AND REQUIRED DATA PASSES VIA STOCHASTIC APPROACHES TO MAXIMUM LIKELIHOOD AND DISCRIMINATIVE TRAINING
    Novak, Petr
    Otec, Roman
    Lee, Antonio
    Goel, Vaibhava
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [24] Discriminative Training and Channel Compensation for Acoustic Language Recognition
    Hubeika, Valiantsina
    Burget, Lukas
    Matejka, Pavel
    Schwarz, Petr
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 301 - 304
  • [25] Acoustic Language Identification Using Fast Discriminative Training
    Castaldo, Fabio
    Colibro, Daniele
    Dalmasso, Emanuele
    Laface, Pietro
    Vair, Claudio
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 389 - +
  • [26] Improving Hyperspectral Pixel Classification With Unsupervised Training Data Selection
    Rajadell, Olga
    Garcia-Sevilla, Pedro
    Viet Cuong Dinh
    Duin, Robert P. W.
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2014, 11 (03) : 656 - 660
  • [27] Training Discriminative Models to Evaluate Generative Ones
    Lesort, Timothee
    Stoian, Andrei
    Goudou, Jean-Francois
    Filliat, David
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: IMAGE PROCESSING, PT III, 2019, 11729 : 604 - 619
  • [28] A discriminative training algorithm for hidden Markov models
    Ben-Yishai, A
    Burshtein, D
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (03): : 204 - 217
  • [29] A comparison of training approaches for discriminative segmental models
    Tang, Hao
    Gimpel, Kevin
    Livescu, Karen
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1219 - 1223
  • [30] Selection, parameter estimation, and discriminative training of hidden Markov models for general audio modeling
    Reyes-Gomez, MJ
    Ellis, DPW
    2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I, PROCEEDINGS, 2003, : 73 - 76