Speech recognition using voice-characteristic-dependent acoustic models

Authors
Suzuki, H [1]
Zen, H [1]
Nankaku, Y [1]
Miyajima, C [1]
Tokuda, K [1]
Kitamura, T [1]
Affiliations
[1] Nagoya Inst Technol, Dept Comp Sci, Nagoya, Aichi 4668555, Japan
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper proposes a speech recognition technique based on acoustic models that account for variations in voice characteristics. Context-dependent acoustic models, typically triphone HMMs, are widely used in continuous speech recognition systems. This work hypothesizes that the speaker voice characteristics humans can perceive by listening are also factors of acoustic variation that should be considered when constructing acoustic models, and it applies a tree-based clustering technique to speaker voice characteristics as well, yielding voice-characteristic-dependent acoustic models. In speech recognition using triphone models, the neighboring phonetic context is given in advance from linguistic-phonetic knowledge; in contrast, in recognition using voice-characteristic-dependent acoustic models, the voice characteristics of the input speech are unknown. This paper therefore proposes a method for recognizing speech even when the voice characteristics of the input speech are unknown. The results of a gender-dependent speech recognition experiment show that the proposed method achieves higher recognition performance than conventional methods.
Pages: 740-743
Page count: 4