Speech recognition using voice-characteristic-dependent acoustic models

Cited by: 0
Authors
Suzuki, H [1]
Zen, H [1]
Nankaku, Y [1]
Miyajima, C [1]
Tokuda, K [1]
Kitamura, T [1]
Affiliations
[1] Nagoya Inst Technol, Dept Comp Sci, Nagoya, Aichi 4668555, Japan
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper proposes a speech recognition technique based on acoustic models that account for variations in voice characteristics. Context-dependent acoustic models, typically triphone HMMs, are often used in continuous speech recognition systems. This work hypothesizes that the speaker voice characteristics that humans can perceive by listening are also factors of acoustic variation to be considered when constructing acoustic models, and it applies a tree-based clustering technique to speaker voice characteristics as well, yielding voice-characteristic-dependent acoustic models. In speech recognition using triphone models, the neighboring phonetic context is given in advance from linguistic-phonetic knowledge; in contrast, the voice characteristics of the input speech are unknown at recognition time when voice-characteristic-dependent acoustic models are used. This paper therefore proposes a method of recognizing speech even when the voice characteristics of the input speech are unknown. The result of a gender-dependent speech recognition experiment shows that the proposed method achieves higher recognition performance than conventional methods.
Pages: 740-743
Number of pages: 4
Related papers
50 in total
  • [31] Phone-context specific gender-dependent acoustic-models for continuous speech recognition
    Neti, C
    Roukos, S
    1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 192 - 198
  • [32] Prosody-dependent Acoustic Modeling for Mandarin Speech Recognition
    Chiu, Tzu-Hsuan
    Chiang, Chen-Yu
    Liao, Yuan-Fu
    Yang, Jyh-Her
    Wang, Yih-Ru
    Chen, Sin-Horng
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON SPEECH PROSODY, VOLS I AND II, 2012, : 139 - 142
  • [33] Voice Analysis Using Acoustic and Throat Microphones for Speech Therapy
    Mathew, Lani Rachel
    Gopakumar, K.
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 173 - 174
  • [34] Multilingual acoustic models for the recognition of non-native speech
    Fischer, V
    Janke, E
    Kunzmann, S
    Ross, T
    ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 331 - 334
  • [35] Decision tree-based acoustic models for speech recognition
    Masami Akamine
    Jitendra Ajmera
    EURASIP Journal on Audio, Speech, and Music Processing, 2012
  • [36] SPEECH RECOGNITION - ACOUSTIC, PHONETIC AND FORMAL-LANGUAGE MODELS
    MERMELSTEIN, P
    LEVINSON, S
    BIOTELEMETRY, 1975, 2 (1-2) : 121 - 123
  • [37] Speech Recognition with Factorial-HMM Syllabic Acoustic Models
    Coro, Gianpaolo
    Cutugno, Francesco
    Caropreso, Fulvio
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1497 - 1500
  • [38] Boosting HMM acoustic models in large vocabulary speech recognition
    Meyer, C
    Schramm, H
    SPEECH COMMUNICATION, 2006, 48 (05) : 532 - 548
  • [39] Decision tree-based acoustic models for speech recognition
    Akamine, Masami
    Ajmera, Jitendra
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2012
  • [40] Context-independent acoustic models for Thai speech recognition
    Kasuriya, S
    Kanokphara, S
    Thatphithakkul, N
    Cotsomrong, P
    Sunpethniyom, T
    IEEE INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES 2004 (ISCIT 2004), PROCEEDINGS, VOLS 1 AND 2: SMART INFO-MEDIA SYSTEMS, 2004, : 991 - 994