Contextual invariant-integration features for improved speaker-independent speech recognition

Cited by: 18
Authors
Mueller, Florian [1]
Mertins, Alfred [1]
Affiliations
[1] Med Univ Lubeck, Inst Signal Proc, D-23538 Lubeck, Germany
Keywords
Speech recognition; Speaker-independency; Invariant-integration; Transformation
DOI
10.1016/j.specom.2011.02.002
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This work presents a feature-extraction method based on the theory of invariant integration. The invariant-integration features (IIFs) are derived from an extended time period, and their computation has very low complexity. Recognition experiments show that the presented features outperform cepstral coefficients computed with a mel filterbank (MFCCs) or a gammatone filterbank (GTCCs) in both matched and mismatched training-test conditions. Even without any speaker adaptation, the presented features yield higher accuracies than MFCCs combined with vocal tract length normalization (VTLN) in matched training-test conditions. It is also shown that the IIFs can be successfully combined with additional speaker-adaptation methods to further increase accuracy. In addition to standard MFCCs, contextual MFCCs are introduced; their performance lies between that of MFCCs and that of IIFs. © 2011 Elsevier B.V. All rights reserved.
Pages: 830-841
Number of pages: 12