A speaker adaptive Chinese syllable recognition system based on discriminative training

Authors
Zhou, L
Imai, S
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we present two speaker adaptation methods for implementing an MSVQ-based adaptive Chinese syllable recognition system. The first method is feature normalization, in which we model inter-speaker variability as a linear transformation. Applying this normalization to the target speaker's speech reduces the inter-speaker acoustic variability. For the second adaptation method, we first present an implementation of the MCE/GPD algorithm for discriminatively training an MSVQ-based speech recognizer. This method is expected to separate confusable classes and to enhance speaker adaptation capability. We carried out recognition experiments to assess performance using CRDB, a standard Chinese syllable database in China. The results show that when both adaptation methods are combined, the error rate reduction on open data is over 62% with a single set of adaptation training data. As the amount of training data increases, speaker adaptation capability improves using MCE/GPD training alone. After using 5 sets of training data, the average recognition rate for two new speakers improved from 72.87% to 97.31%, which is the best performance reported on this database.
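As a rough illustration of the feature normalization idea in the abstract, the sketch below fits a linear transformation that maps a target speaker's feature frames onto a reference speaker's frames. The abstract does not say how the transformation is estimated, so everything here is an assumption made for illustration: the per-dimension (diagonal) scale-and-offset form, the least-squares fit, the requirement that the two speakers' frames are already time-aligned, and the names `fit_diagonal_affine` and `normalize`.

```python
from statistics import fmean


def fit_diagonal_affine(target_frames, reference_frames):
    """Fit y ~= a * x + b independently for each feature dimension.

    Assumes target_frames[i] and reference_frames[i] are time-aligned
    frames of the same utterance (an assumption, not from the paper).
    """
    dims = len(target_frames[0])
    scales, offsets = [], []
    for d in range(dims):
        x = [frame[d] for frame in target_frames]
        y = [frame[d] for frame in reference_frames]
        mean_x, mean_y = fmean(x), fmean(y)
        var_x = sum((xi - mean_x) ** 2 for xi in x)
        cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        # Fall back to an identity scale if the dimension is constant.
        a = cov_xy / var_x if var_x > 0 else 1.0
        scales.append(a)
        offsets.append(mean_y - a * mean_x)
    return scales, offsets


def normalize(frame, scales, offsets):
    """Apply the fitted per-dimension transformation to one frame."""
    return [a * v + b for v, a, b in zip(frame, scales, offsets)]


# Synthetic example: the "target speaker" is an exact affine distortion
# of the reference, so the fit should undo the distortion.
reference = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0], [3.0, 5.0]]
target = [[2.0 * r[0] + 1.0, 0.5 * r[1] - 2.0] for r in reference]

scales, offsets = fit_diagonal_affine(target, reference)
normalized = [normalize(frame, scales, offsets) for frame in target]
```

A full-matrix transformation would capture correlations between feature dimensions at the cost of more adaptation data. The diagonal form is used here only to keep the sketch short.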
Pages: 31-36
Page count: 6