Acoustic model adaptation based on pronunciation variability analysis for non-native speech recognition

Cited by: 20
Authors
Oh, Yoo Rhee [1]
Yoon, Jae Sam [1]
Kim, Hong Kook [1]
Affiliations
[1] Gwangju Inst Sci & Technol, Dept Informat & Commun, Kwangju 500712, South Korea
Keywords
speech recognition; non-native speech; knowledge-based pronunciation variability; data-driven pronunciation variability; state-tying; state-clustering; decision tree; acoustic model adaptation
DOI
10.1016/j.specom.2006.10.006
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, pronunciation variability between native and non-native speakers is investigated, and a novel acoustic model adaptation method is proposed based on pronunciation variability analysis in order to improve the performance of a speech recognition system by non-native speakers. The proposed acoustic model adaptation method is performed in two steps: analysis of the pronunciation variability of non-native speech, and acoustic model adaptation based on the pronunciation variability analysis. In order to obtain informative variant phonetic units, we analyze the pronunciation variability of non-native speech in two ways: a knowledge-based approach, and a data-driven approach. Next, for each approach, the acoustic model corresponding to each informative variant phonetic unit is adapted such that the state-tying of the acoustic model for non-native speech reflects a phonetic variability. For further improvement, a conventional acoustic model adaptation method such as MLLR and/or MAP is combined with the proposed acoustic model adaptation method. It is shown from the continuous Korean-English speech recognition experiments that the proposed method achieves an average word error rate reduction of 16.76% and 12.80% for the knowledge-based approach and the data-driven approach, respectively, when compared with the baseline speech recognition system trained by native speech. Moreover, a reduction of 53.45% and 57.14% in the average word error rate is obtained by combining MLLR and MAP adaptations to the adapted acoustic models by the proposed method for the knowledge-based approach and the data-driven approach, respectively. (C) 2006 Elsevier B.V. All rights reserved.
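The word error rate (WER) figures quoted in the abstract can be read against the standard definition of WER as the word-level edit distance divided by the reference length. The sketch below is illustrative only and is not taken from the paper: the function names `word_error_rate` and `relative_reduction` are hypothetical, and it assumes the reported reductions are relative to the baseline WER, which the abstract does not state explicitly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Relative WER reduction in percent (assumed interpretation)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up sentences, used only to exercise the functions.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")
    print(f"Relative reduction: {relative_reduction(40.0, 30.0):.2f}%")
```

Under this reading, a system whose WER falls from 40% to 30% shows a 25% relative reduction, even though the absolute drop is 10 points.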
Pages: 59-70
Page count: 12
Related papers
50 in total
  • [1] Acoustic model adaptation based on pronunciation variability analysis for non-native speech recognition
    Oh, Yoo Rhee
    Yoon, Jae Sam
    Kim, Hong Kook
    2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006, : 137 - 140
  • [2] A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition
    Oh, Yoo Rhee
    Kim, Hong Kook
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): : 2379 - 2387
  • [3] Acoustic and pronunciation model adaptation for context-independent and context-dependent pronunciation variability of non-native speech
    Oh, Yoo Rhee
    Kim, Mina
    Kim, Hong Kook
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4281 - 4284
  • [4] Combined Acoustic and Pronunciation Modelling for Non-Native Speech Recognition
    Bouselmi, G.
    Fohr, D.
    Illina, I.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1209 - +
  • [5] Multilingual recognition of non-native speech using acoustic model transformation and pronunciation modeling
    G. Bouselmi
    D. Fohr
    I. Illina
    International Journal of Speech Technology, 2012, 15 (2) : 203 - 213
  • [6] Multilingual recognition of non-native speech using acoustic model transformation and pronunciation modeling
    Bouselmi, G.
    Fohr, D.
    Illina, I.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (02) : 203 - 213
  • [7] MLLR/MAP Adaptation Using Pronunciation Variation for Non-native Speech Recognition
    Oh, Yoo Rhee
    Kim, Hong Kook
    2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 216 - 221
  • [8] Acoustic model interpolation for non-native speech recognition
    Tan, Tien-Ping
    Besacier, Laurent
    2007 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol IV, Pts 1-3, 2007, : 1009 - 1012
  • [9] Improving Pronunciation Modeling for Non-Native Speech Recognition
    Tan, Tien-Ping
    Besacier, Laurent
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1801 - 1804
  • [10] Comparison of acoustic model adaptation techniques on non-native speech
    Wang, ZR
    Schultz, T
    Waibel, A
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 540 - 543