Front-End Feature Compensation for Noise Robust Speech Emotion Recognition

Cited by: 1
Authors
Pandharipande, Meghna [1]
Chakraborty, Rupayan [1]
Panda, Ashish [1]
Das, Biswajit [1]
Kopparapu, Sunil Kumar [1]
Affiliation
[1] TCS Res & Innovat Mumbai, Yantra Pk, Thana 400601, Maharashtra, India
Keywords
Emotion recognition; Noisy speech; Feature compensation; Auditory masking; Vector Taylor Series
DOI
10.23919/eusipco.2019.8902981
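Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract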
Robust feature compensation and selection are important aspects of the noisy speech emotion recognition (SER) task, especially under mismatched conditions, where models are trained on clean speech and tested in noisy scenarios. Here we propose the use of front-end feature compensation techniques based on Vector Taylor Series (VTS) expansion and VTS with auditory masking (VTS-AM) to improve the performance of SER systems. On top of VTS and VTS-AM, we compare the performance of log compression and root compression applied to the mel-filter-bank energies. Further, we demonstrate the benefit of feature selection applied to the non-MFCC high-level descriptors in conjunction with VTS, VTS-AM, and root compression. System performance is compared with a popular Non-negative Matrix Factorization (NMF) based enhancement technique and an energy-based voice activity detector (VAD), which discards silent or noisy frames in the spoken utterances. To demonstrate the efficacy of the proposed techniques, extensive experiments are conducted on two standard datasets (EmoDB and IEMOCAP), contaminated with five types of noise (Babble, F-16, Factory, Volvo, and HF-channel) from the Noisex-92 noise database at five SNR levels (0 dB, 5 dB, 10 dB, 15 dB, and 20 dB).
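
Two steps named in the abstract can be illustrated with a minimal sketch: contaminating clean speech with noise at a target SNR, and compressing mel-filter-bank energies with either a logarithm or a root. This is not the authors' implementation. The function names, the energy floor, and the root exponent of 0.1 are all hypothetical choices made for illustration, since the abstract does not state them, and a synthetic sine wave and Gaussian noise stand in for the EmoDB/IEMOCAP speech and Noisex-92 noise.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Choose gain g so that p_speech / (g^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB, treating (noisy - speech) as the noise component."""
    p_s = sum(s * s for s in speech)
    p_n = sum((y - s) ** 2 for s, y in zip(speech, noisy))
    return 10 * math.log10(p_s / p_n)


def log_compress(energies, floor=1e-10):
    """Natural-log compression; `floor` (an assumption) avoids log(0)."""
    return [math.log(max(e, floor)) for e in energies]


def root_compress(energies, root=0.1):
    """Root (power-law) compression; the exponent 0.1 is an assumed value."""
    return [max(e, 0.0) ** root for e in energies]


# Synthetic stand-ins for a clean utterance and a noise recording.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]

# The five SNR levels listed in the abstract.
for snr_db in (0, 5, 10, 15, 20):
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 3))
```

One property visible in the sketch is that root compression stays bounded as an energy approaches zero, whereas the logarithm needs a floor to stay finite. This is only an observation about the two functions, not a claim about the results reported in the paper.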
Pages: 5