Contextual invariant-integration features for improved speaker-independent speech recognition

Cited by: 18
Authors
Mueller, Florian [1]
Mertins, Alfred [1]
Affiliations
[1] Med Univ Lubeck, Inst Signal Proc, D-23538 Lubeck, Germany
Keywords
Speech recognition; Speaker-independency; Invariant-integration; TRANSFORMATION;
DOI
10.1016/j.specom.2011.02.002
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
This work presents a feature-extraction method based on the theory of invariant integration. The invariant-integration features (IIFs) are derived from an extended time period, and their computation has very low complexity. Recognition experiments show that the presented feature type outperforms cepstral coefficients computed with a mel filterbank (MFCCs) or a gammatone filterbank (GTCCs) in matching as well as mismatching training-testing conditions. Even without any speaker adaptation, the presented features yield higher accuracies than MFCCs combined with vocal tract length normalization (VTLN) in matching training-testing conditions. It is also shown that the IIFs can be successfully combined with additional speaker-adaptation methods to further increase accuracy. In addition to standard MFCCs, contextual MFCCs are introduced; their performance lies between that of MFCCs and IIFs. (C) 2011 Elsevier B.V. All rights reserved.
Pages: 830-841
Number of pages: 12
Related papers
50 items in total
  • [41] DSP-based large vocabulary speaker-independent speech recognition
    Hirayama, H
    Yoshida, K
    Koga, S
    Hattori, H
    NEC RESEARCH & DEVELOPMENT, 1996, 37 (04): : 528 - 534
  • [42] SPEAKER-INDEPENDENT SPEECH-RECOGNITION SYSTEM BASED ON LINEAR PREDICTION
    GUPTA, VN
    BRYAN, JK
    GOWDY, JN
    IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1978, 26 (01): : 27 - 33
  • [43] A HMM-based integrated method for speaker-independent speech recognition
    Zhang, YY
    Zhu, XY
    ICSP '98: 1998 FOURTH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1998, : 613 - 616
  • [44] Speaker-independent embedded speech recognition using Hidden Markov Models
    Marufo da Silva, Mariano
    Evin, Diego A.
    Verrastro, Sebastian
    IEEE CACIDI 2016 - IEEE CONFERENCE ON COMPUTER SCIENCES, 2016,
  • [45] Speaker-Independent Silent Speech Recognition with Across-Speaker Articulatory Normalization and Speaker Adaptive Training
    Wang, Jun
    Hahm, Seongjun
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2415 - 2419
  • [46] IMPROVED SPEAKER-INDEPENDENT EMOTION RECOGNITION FROM SPEECH USING TWO-STAGE FEATURE REDUCTION
    Nazid, Hasrul Mohd
    Muthusamy, Hariharan
    Vijean, Vikneswaran
    Yaacob, Sazali
    JOURNAL OF INFORMATION AND COMMUNICATION TECHNOLOGY-MALAYSIA, 2015, 14 : 57 - 76
  • [47] Speaker-independent recognition of Chinese tones
    GUAN Cuntai and CHEN Yongbin(Dep. of Radio Eng.
    Chinese Journal of Acoustics, 1993, (02) : 142 - 148
  • [48] Speaker-Invariant Features for Automatic Speech Recognition
    Umesh, S.
    Sanand, D. R.
    Praveen, G.
    20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 1738 - 1743
  • [49] Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition
    Itoh, Arata
    Hara, Sunao
    Kitaoka, Norihide
    Takeda, Kazuya
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (10): : 2479 - 2485
  • [50] SPEAKER-INDEPENDENT DIGIT RECOGNITION SYSTEM
    SAMBUR, MR
    RABINER, LR
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1974, 56 : S26 - S26