Implicit modelling of pronunciation variation in automatic speech recognition

Cited by: 27
Authors
Hain, T [1]
Affiliation
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
Keywords
automatic speech recognition; pronunciation modelling; acoustic modelling; hidden Markov models; pronunciation dictionaries; single pronunciations; parameter tying; phonetic decision trees; state clustering; conversational speech recognition; Hidden Model Sequence Models
DOI
10.1016/j.specom.2005.03.008
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Modelling of pronunciation variability is an important task for the acoustic model of an automatic speech recognition system. Good pronunciation models contribute to the robustness and generic applicability of a speech recogniser. Pronunciation modelling is usually associated with a lexicon that allows explicit control over the selection of appropriate HMMs for a particular word. However, the use of data-driven clustering techniques or specific parameter tying techniques has considerable impact on this form of model selection and on the construction of a task-optimal dictionary. Most large vocabulary speech recognition systems use a dictionary with multiple possible pronunciation variants per word. Through the manual addition of pronunciation variants, explicit human knowledge is used in the recognition process. For reasons of complexity, optimising manual entries for performance is often not feasible. This paper describes a method for the stepwise reduction of the number of pronunciation variants per word to one. By doing so in a way consistent with the classification procedure, pronunciation variation is modelled implicitly. It is shown that single pronunciation dictionaries provide similar or better word error rate performance on both Wall Street Journal and Switchboard data. Using single pronunciation dictionaries in conjunction with Hidden Model Sequence Models, as an example of an implicit pronunciation modelling technique, shows further improvements. (c) 2005 Elsevier B.V. All rights reserved.
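The abstract does not say how the single variant per word is chosen, so the following is only an illustrative sketch of the general idea of stepwise reduction. It assumes, hypothetically, that variants are ranked by how often they were selected in an alignment of the training data; the function names `prune_variants` and `stepwise_reduce`, the toy lexicon, and the counts are all invented for illustration and are not taken from the paper.

```python
from collections import Counter


def prune_variants(lexicon, counts, max_variants):
    """Keep at most `max_variants` pronunciations per word.

    Variants are ranked by descending count (an assumed selection
    criterion); ties keep their original dictionary order because
    Python's sort is stable.
    """
    pruned = {}
    for word, variants in lexicon.items():
        ranked = sorted(variants, key=lambda v: -counts[(word, v)])
        pruned[word] = ranked[:max_variants]
    return pruned


def stepwise_reduce(lexicon, counts):
    """Lower the per-word variant limit one step at a time down to 1."""
    start = max(len(variants) for variants in lexicon.values())
    for limit in range(start - 1, 0, -1):
        lexicon = prune_variants(lexicon, counts, limit)
        # A real system would presumably retrain or realign the acoustic
        # models here and recompute `counts`; this sketch reuses them.
    return lexicon


# Toy multi-pronunciation dictionary (hypothetical phone strings).
lexicon = {
    "the": [("dh", "ax"), ("dh", "iy"), ("dh", "ah")],
    "and": [("ae", "n", "d"), ("ax", "n")],
    "cat": [("k", "ae", "t")],
}

# Hypothetical counts of how often each variant was chosen in alignment.
counts = Counter({
    ("the", ("dh", "ax")): 120,
    ("the", ("dh", "iy")): 30,
    ("the", ("dh", "ah")): 5,
    ("and", ("ax", "n")): 80,
    ("and", ("ae", "n", "d")): 40,
})

single = stepwise_reduce(lexicon, counts)
for word, variants in single.items():
    print(word, " ".join(variants[0]))
```

In this sketch every word ends with exactly one pronunciation, and the remaining variation would have to be absorbed by the acoustic models, which is the sense in which the abstract calls the modelling implicit.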
Pages: 171 - 188
Number of pages: 18
Related papers
50 in total
  • [1] Special issue on modeling pronunciation variation for automatic speech recognition
    Strik, H
    SPEECH COMMUNICATION, 1999, 29 (2-4) : 81 - 82
  • [2] AUTOMATIC PRONUNCIATION VERIFICATION FOR SPEECH RECOGNITION
    Rao, Kanishka
    Peng, Fuchun
    Beaufays, Francoise
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5162 - 5166
  • [3] Automatic Speech Recognition and Pronunciation Training
    Xiao, Wenqi
    PROCEEDINGS OF THE 2018 2ND INTERNATIONAL CONFERENCE ON EDUCATION, ECONOMICS AND MANAGEMENT RESEARCH (ICEEMR 2018), 2018, 182 : 466 - 468
  • [4] Automatic modelling of regional pronunciation variation for Russian
    Shalonova, KB
    TEXT, SPEECH AND DIALOGUE, 1999, 1692 : 329 - 332
  • [5] Pronunciation learner autonomy: The potential of Automatic Speech Recognition
    McCrocklin, Shannon M.
    SYSTEM, 2016, 57 : 25 - 42
  • [6] Pronunciation change in conversational speech and its implications for automatic speech recognition
    Saraçlar, M
    Khudanpur, S
    COMPUTER SPEECH AND LANGUAGE, 2004, 18 (04) : 375 - 395
  • [7] Automatic speech recognition and intrinsic speech variation
    Benzeguiba, M.
    De Mori, R.
    Deroo, O.
    Dupont, S.
    Erbes, T.
    Jouvet, D.
    Fissore, L.
    Laface, R.
    Mertins, A.
    Ris, C.
    Rose, R.
    Tyagi, V.
    Wellekens, C.
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 5879 - 5882
  • [8] IMPLICIT TRAJECTORY MODELLING USING TEMPORALLY VARYING WEIGHT REGRESSION FOR AUTOMATIC SPEECH RECOGNITION
    Liu, Shilin
    Sim, Khe Chai
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4761 - 4764
  • [9] AUTOMATIC EVALUATION OF ENGLISH PRONUNCIATION BASED ON SPEECH RECOGNITION TECHNIQUES
    HAMADA, H
    MIKI, S
    NAKATSU, R
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1993, E76D (03) : 352 - 359
  • [10] Automatic evaluation of Dutch pronunciation by using speech recognition technology
    Cucchiarini, C
    Strik, H
    Boves, L
    1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 622 - 629