Acoustic model adaptation based on pronunciation variability analysis for non-native speech recognition

Cited by: 20
Authors
Oh, Yoo Rhee [1 ]
Yoon, Jae Sam [1 ]
Kim, Hong Kook [1 ]
Affiliations
[1] Gwangju Inst Sci & Technol, Dept Informat & Commun, Kwangju 500712, South Korea
Keywords
speech recognition; non-native speech; knowledge-based pronunciation variability; data-driven pronunciation variability; state-tying; state-clustering; decision tree; acoustic model adaptation
DOI
10.1016/j.specom.2006.10.006
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, pronunciation variability between native and non-native speakers is investigated, and a novel acoustic model adaptation method is proposed based on pronunciation variability analysis in order to improve the performance of a speech recognition system by non-native speakers. The proposed acoustic model adaptation method is performed in two steps: analysis of the pronunciation variability of non-native speech, and acoustic model adaptation based on the pronunciation variability analysis. In order to obtain informative variant phonetic units, we analyze the pronunciation variability of non-native speech in two ways: a knowledge-based approach, and a data-driven approach. Next, for each approach, the acoustic model corresponding to each informative variant phonetic unit is adapted such that the state-tying of the acoustic model for non-native speech reflects a phonetic variability. For further improvement, a conventional acoustic model adaptation method such as MLLR and/or MAP is combined with the proposed acoustic model adaptation method. It is shown from the continuous Korean-English speech recognition experiments that the proposed method achieves an average word error rate reduction of 16.76% and 12.80% for the knowledge-based approach and the data-driven approach, respectively, when compared with the baseline speech recognition system trained by native speech. Moreover, a reduction of 53.45% and 57.14% in the average word error rate is obtained by combining MLLR and MAP adaptations to the adapted acoustic models by the proposed method for the knowledge-based approach and the data-driven approach, respectively. (C) 2006 Elsevier B.V. All rights reserved.
Pages: 59-70
Number of pages: 12
Related papers
50 items in total
  • [31] Fully automated non-native speech recognition using confusion-based acoustic model integration and graphemic constraints
    Bouselmi, Ghazi
    Fohr, Dominique
    Illina, Irina
    Haton, Jean Paul
    2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006, : 345 - 348
  • [32] Multilingual Non-Native Speech Recognition using Phonetic Confusion-Based Acoustic Model Modification and Graphemic Constraints
    Bouselmi, G.
    Fohr, D.
    Illina, I.
    Haton, J. -P.
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 109 - +
  • [33] Adapting the Acoustic Model of a Speech Recognizer for Varied Proficiency Non-Native Spontaneous Speech Using Read Speech with Language-Specific Pronunciation Difficulty
    Zechner, Klaus
    Higgins, Derrick
    Lawless, Rene
    Futagi, Yoko
    Ohls, Sarah
    Ivanov, George
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 612 - 615
  • [34] Adapting the acoustic model of a speech recognizer for varied proficiency non-native spontaneous speech using read speech with language-specific pronunciation difficulty
    Educational Testing Service, Princeton, NJ, United States
    Proc. Annu. Conf. Int. Speech. Commun. Assoc., INTERSPEECH, (604-607):
  • [35] Non-native Listeners' Recognition of High-Variability Speech Using PRESTO
    Tamati, Terrin N.
    Pisoni, David B.
    JOURNAL OF THE AMERICAN ACADEMY OF AUDIOLOGY, 2014, 25 (09) : 869 - 892
  • [36] Non-native speech recognition sentences: A new materials set for non-native speech perception research
    Stringer, Louise
    Iverson, Paul
    BEHAVIOR RESEARCH METHODS, 2020, 52 (02) : 561 - 571
  • [37] Non-native speech recognition sentences: A new materials set for non-native speech perception research
    Louise Stringer
    Paul Iverson
    Behavior Research Methods, 2020, 52 : 561 - 571
  • [38] Automatic Pronunciation Assessment of Non-native English Based on Phonological Analysis
    Rios-Urrego, C. D.
    Escobar-Grisales, D.
    Moreno-Acevedo, S. A.
    Perez-Toro, P. A.
    Noeth, E.
    Orozco-Arroyave, J. R.
    TEXT, SPEECH, AND DIALOGUE, TSD 2023, 2023, 14102 : 339 - 348
  • [39] An acoustic-phonetic analysis of large vocabulary continuous Mandarin speech recognition for non-native speakers
    Yang, J
    Pu, YY
    Wei, H
    Zhao, ZP
    2004 International Symposium on Chinese Spoken Language Processing, Proceedings, 2004, : 241 - 244
  • [40] Lexical modeling of non-native speech for automatic speech recognition
    Livescu, K
    Glass, J
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1683 - 1686