Training data selection for improving discriminative training of acoustic models

Cited by: 9
Authors
Liu, Shih-Hung [1]
Chu, Fang-Hui [1]
Lin, Shih-Hsiang [1]
Lee, Hung-Shin [1]
Chen, Berlin [1]
Affiliations
[1] Natl Taiwan Normal Univ, Grad Inst Comp Sci & Informat Engn, Taipei, Taiwan
Keywords
speech recognition; discriminative training; acoustic models; data selection; entropy
DOI
10.1109/ASRU.2007.4430125
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches were proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance was utilized for utterance-level data selection. Second, phone-level data selection based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice was investigated. Finally, frame-level data selection based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice was explored. The underlying characteristics of the presented approaches were extensively investigated and their performance was verified by comparison with the standard discriminative training approaches. Experiments conducted on Mandarin broadcast news collected in Taiwan showed that both phone- and frame-level data selection could achieve slight but consistent improvements over the baseline systems at lower training iterations.
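The frame-level criterion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the following assumes the common definition: the Shannon entropy of the Gaussian posteriors at a frame, divided by the log of the number of Gaussians so that it lies in [0, 1]. The function names, the `threshold` parameter, and the rule of keeping high-entropy frames are all hypothetical and are not taken from the paper.

```python
import math


def normalized_entropy(posteriors):
    """Entropy of one frame's Gaussian posteriors, scaled to [0, 1].

    Assumed definition (not confirmed by the abstract):
    H = -sum(p * log p) / log(M), with M the number of Gaussians.
    """
    total = sum(posteriors)
    if total <= 0 or len(posteriors) < 2:
        return 0.0
    # Renormalize so the posteriors sum to one.
    probs = [p / total for p in posteriors]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def select_frames(frame_posteriors, threshold=0.5):
    """Return indices of frames whose normalized entropy is at least
    `threshold` (a hypothetical selection rule for illustration)."""
    return [
        t for t, post in enumerate(frame_posteriors)
        if normalized_entropy(post) >= threshold
    ]


# Toy posteriors for three frames over two Gaussians.
frames = [[1.0, 0.0], [0.5, 0.5], [0.9, 0.1]]
selected = select_frames(frames, threshold=0.5)
```

Under this assumed rule, a frame whose posterior mass is spread evenly across Gaussians scores close to 1 and is kept, while a frame dominated by a single Gaussian scores close to 0 and is dropped.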
Pages: 284-289
Number of pages: 6
Related papers
50 in total
  • [41] Selection and Training Schemes for Improving TTS Voice Built on Found Data
    Kuo, F-Y
    Ouyang, I. C.
    Aryal, S.
    Lanchantin, P.
    INTERSPEECH 2019, 2019, : 1516 - 1520
  • [42] DynImpt: A Dynamic Data Selection Method for Improving Model Training Efficiency
    Huang, Wei
    Zhang, Yunxiao
    Guo, Shangmin
    Shang, Yu-Ming
    Fu, Xiangling
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (01) : 239 - 252
  • [43] Discriminative Training and Unsupervised Adaptation for Labeling Prosodic Events with Limited Training Data
    Fernandez, Raul
    Ramabhadran, Bhuvana
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1429 - 1432
  • [44] Investigations on discriminative training in large scale acoustic model estimation
    Pylkkonen, Janne
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 244 - 247
  • [45] Discriminative training of CRF models with probably submodular constraints
    Zaremba, Wojciech
    Blaschko, Matthew B.
    2016 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2016), 2016,
  • [46] DISCRIMINATIVE TRAINING FOR BAYESIAN SENSING HIDDEN MARKOV MODELS
    Saon, George
    Chien, Jen-Tzung
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5316 - 5319
  • [47] Maximum likelihood and discriminative training of direct translation models
    Papineni, KA
    Roukos, S
    Ward, RT
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 189 - 192
  • [48] Iterative Training of Discriminative Models for the Generalized Hough Transform
    Ruppertshofen, Heike
    Lorenz, Cristian
    Schmidt, Sarah
    Beyerlein, Peter
    Salah, Zein
    Rose, Georg
    Schramm, Hauke
    MEDICAL COMPUTER VISION: RECOGNITION TECHNIQUES AND APPLICATIONS IN MEDICAL IMAGING, 2011, 6533 : 21 - +
  • [49] Hope and Fear for Discriminative Training of Statistical Translation Models
    Chiang, David
    JOURNAL OF MACHINE LEARNING RESEARCH, 2012, 13 : 1159 - 1187
  • [50] A new look at discriminative training for hidden Markov models
    He, Xiaodong
    Deng, Li
    PATTERN RECOGNITION LETTERS, 2007, 28 (11) : 1285 - 1294