Using multiple acoustic feature sets for speech recognition

Cited by: 20
Authors
Zolnay, Andras [1]
Kocharov, Daniil
Schlueter, Ralf
Ney, Hermann
Affiliations
[1] Univ Aachen, Rhein Westfal TH Aachen, Lehrstuhl Informat 6, Dept Comp Sci, D-52056 Aachen, Germany
[2] St Petersburg State Univ, Dept Phonet, St Petersburg 199034, Russia
Keywords
acoustic feature extraction; auditory features; articulatory features; voicing; spectrum derivative feature; linear discriminant analysis; discriminative model combination;
DOI
10.1016/j.specom.2007.04.005
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, the use of multiple acoustic feature sets for speech recognition is investigated. The combination of both auditory as well as articulatory motivated features is considered. In addition to a voicing feature, we introduce a recently developed articulatory motivated feature, the spectrum derivative feature. Features are combined both directly using linear discriminant analysis (LDA) as well as indirectly on model level using discriminative model combination (DMC). Experimental results are presented for both small- and large-vocabulary tasks. The results show that the accuracy of automatic speech recognition systems can be significantly improved by the combination of auditory and articulatory motivated features. The word error rate is reduced from 1.8% to 1.5% on the SieTill task for German digit string recognition. Consistent improvements in word error rate have been obtained on two large-vocabulary corpora. The word error rate is reduced from 19.1% to 18.4% on the VerbMobil II corpus, a German large-vocabulary conversational speech task, and from 14.1% to 13.5% on the British English part of the European parliament plenary sessions (EPPS) task from the 2005 TC-STAR ASR evaluation campaign. © 2007 Elsevier B.V. All rights reserved.
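The direct combination mentioned in the abstract, in which several feature streams are merged and then reduced with LDA, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `lda_projection`, the stream dimensions (12 and 1), the three classes, and the synthetic Gaussian data are all assumptions made for illustration, and any frame-windowing step the paper may use is left out.

```python
import numpy as np


def lda_projection(features, labels, out_dim):
    """Estimate an LDA matrix of shape (in_dim, out_dim) from labelled frames."""
    dim = features.shape[1]
    mean_all = features.mean(axis=0)
    s_within = np.zeros((dim, dim))
    s_between = np.zeros((dim, dim))
    for c in np.unique(labels):
        x = features[labels == c]
        mean_c = x.mean(axis=0)
        centered = x - mean_c
        s_within += centered.T @ centered          # within-class scatter
        diff = (mean_c - mean_all)[:, None]
        s_between += len(x) * (diff @ diff.T)      # between-class scatter

    # Solve S_b v = lambda S_w v by whitening with the Cholesky factor of S_w.
    chol = np.linalg.cholesky(s_within)            # S_w = chol @ chol.T
    tmp = np.linalg.solve(chol, s_between)         # L^-1 S_b
    sym = np.linalg.solve(chol, tmp.T)             # L^-1 S_b L^-T
    sym = 0.5 * (sym + sym.T)                      # remove rounding asymmetry
    eigvals, eigvecs = np.linalg.eigh(sym)         # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :out_dim]            # keep the largest out_dim
    return np.linalg.solve(chol.T, top)            # map back: v = L^-T u


# Synthetic stand-ins for two feature streams (hypothetical dimensions).
rng = np.random.default_rng(0)
n_per_class, n_classes = 200, 3
labels = np.repeat(np.arange(n_classes), n_per_class)
stream_a = rng.normal(size=(labels.size, 12)) + labels[:, None]        # e.g. spectral features
stream_b = rng.normal(size=(labels.size, 1)) + 0.5 * labels[:, None]   # e.g. a voicing measure

# Direct combination: concatenate the streams per frame, then project with LDA.
combined = np.hstack([stream_a, stream_b])
projection = lda_projection(combined, labels, out_dim=2)
reduced = combined @ projection
print(reduced.shape)  # (600, 2)
```

With three classes, at most two LDA directions carry between-class information, which is why `out_dim=2` is used here; a real system would use many more classes and a larger output dimension.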
Pages: 514 - 525
Number of pages: 12
Related papers
50 in total
  • [1] Facial Expression Recognition using Multiple Feature Sets
    Shaukat, Arslan
    Aziz, Mansoor
    Akram, Usman
    2015 5TH INTERNATIONAL CONFERENCE ON IT CONVERGENCE AND SECURITY (ICITCS), 2015,
  • [2] A novel feature-extraction for speech recognition based on multiple acoustic-feature planes
    Nitta, T
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 29 - 32
  • [3] Acoustic feature combination for robust speech recognition
    Zolnay, A
    Schlüter, R
    Ney, H
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 457 - 460
  • [4] Feature sets in continuous speech recognition for the Portuguese language
    dos Santos, SCB
    Alcaim, A
    ITS '98 PROCEEDINGS - SBT/IEEE INTERNATIONAL TELECOMMUNICATIONS SYMPOSIUM, VOLS 1 AND 2, 1998, : 126 - 129
  • [5] Selective Acoustic Feature Enhancement for Speech Emotion Recognition With Noisy Speech
    Leem, Seong-Gyun
    Fulford, Daniel
    Onnela, Jukka-Pekka
    Gard, David
    Busso, Carlos
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 917 - 929
  • [6] Survey on Acoustic Modeling and Feature Extraction for Speech Recognition
    Garg, Anjali
    Sharma, Poonam
    PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 2291 - 2295
  • [7] An optimal two stage feature selection for speech emotion recognition using acoustic features
    Kuchibhotla S.
    Vankayalapati H.D.
    Anne K.R.
    International Journal of Speech Technology, 2016, 19 (4) : 657 - 667
  • [8] Significance of Feature Selection for Acoustic Modeling in Dysarthric Speech Recognition
    Mathew, Jerin Baby
    Jacob, Jonie
    Sajeev, Karun
    Joy, Jithin
    Rajan, Rajeev
    2018 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET), 2018,
  • [9] Speech Emotion Recognition Based on Multi Acoustic Feature Fusion
    Xiang, Shanshan
    Anwer, Sadiyagul
    Yilahun, Hankiz
    Hamdulla, Askar
    MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024, 2025, 2312 : 338 - 346
  • [10] Acoustic feature selection for automatic emotion recognition from speech
    Rong, Jia
    Li, Gang
    Chen, Yi-Ping Phoebe
    INFORMATION PROCESSING & MANAGEMENT, 2009, 45 (03) : 315 - 328