Speaker normalisation for speech-based emotion detection

Cited by: 32
Authors
Sethu, Vidhyasaharan [1, 2]
Ambikairajah, Eliathamby [1, 2]
Epps, Julien [1, 3]
Affiliations
[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
[2] NICTA, Sydney, NSW, Australia
[3] UNSW Asia, Singapore 248922, Singapore
Keywords
feature warping; cumulative distribution mapping; emotion detection; hidden Markov model;
DOI
10.1109/ICDSP.2007.4288656
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The focus of this paper is on speech-based emotion detection utilising only acoustic data, i.e. without using any linguistic or semantic information. However, this approach in general suffers from the fact that acoustic data is speaker-dependent, which can result in inefficient estimation of the statistics modelled by classifiers such as hidden Markov models (HMMs) and Gaussian mixture models (GMMs). We propose the use of speaker-specific feature warping as a means of normalising acoustic features to overcome the problem of speaker dependency. In this paper we compare the performance of a system that uses feature warping to one that does not. The back-end employs an HMM-based classifier that captures the temporal variations of the feature vectors by modelling them as transitions between different states. Evaluations conducted on the LDC Emotional Prosody speech corpus reveal a relative increase in classification accuracy of up to 20%.
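The abstract names feature warping (cumulative distribution mapping) but gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each speaker's frames are warped independently, that the target distribution is a standard normal, and that ranks are taken over all of a speaker's frames rather than over a sliding window. The function names `warp_features` and `warp_by_speaker` are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def warp_features(features: np.ndarray) -> np.ndarray:
    """Warp one speaker's (frames x dims) features so each column follows N(0, 1).

    Assumption: rank-based cumulative distribution mapping over all frames.
    """
    n_frames, n_dims = features.shape
    inv_cdf = NormalDist().inv_cdf
    # Target value for each rank; the +0.5 offset keeps probabilities inside (0, 1).
    targets = np.array([inv_cdf((r + 0.5) / n_frames) for r in range(n_frames)])
    warped = np.empty_like(features, dtype=float)
    for d in range(n_dims):
        # Double argsort yields the rank of every frame within this dimension.
        # Tied values receive distinct neighbouring ranks.
        ranks = np.argsort(np.argsort(features[:, d], kind="stable"), kind="stable")
        warped[:, d] = targets[ranks]
    return warped


def warp_by_speaker(features, speaker_ids) -> np.ndarray:
    """Apply warp_features separately to the frames of each speaker."""
    features = np.asarray(features, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    warped = np.empty_like(features)
    for speaker in np.unique(speaker_ids):
        mask = speaker_ids == speaker
        warped[mask] = warp_features(features[mask])
    return warped
```

Because the mapping depends only on the rank of each value within a speaker's own data, speaker-specific offsets and scalings of a feature are removed while the ordering of frames is preserved, which is the sense in which such warping can reduce speaker dependency before the features reach a classifier.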
Pages: 611 / +
Page count: 2
Related papers
50 in total
  • [21] Speaker Turn Aware Similarity Scoring for Diarization of Speech-Based Cognitive Assessments
    Xu, Sean Shensheng
    Mak, Man-Wai
    Wong, Ka Ho
    Meng, Helen
    Kwok, Timothy C. Y.
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 1299 - 1304
  • [22] Robust Multi-Scenario Speech-Based Emotion Recognition System
    Zhu-Zhou, Fangfang
    Gil-Pita, Roberto
    Garcia-Gomez, Joaquin
    Rosa-Zurera, Manuel
    SENSORS, 2022, 22 (06)
  • [23] Multicriteria Neural Network Design in the Speech-based Emotion Recognition Problem
    Brester, Christina
    Semenkin, Eugene
    Sidorov, Maxim
    Semenkina, Olga
    ICIMCO 2015 PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS, VOL. 1, 2015, : 621 - 628
  • [24] Enhanced multiclass SVM with thresholding fusion for speech-based emotion classification
    Yang N.
    Yuan J.
    Zhou Y.
    Demirkol I.
    Duan Z.
    Heinzelman W.
    Sturge-Apple M.
    International Journal of Speech Technology, 2017, 20 (01) : 27 - 41
  • [25] Speech-based Emotion Recognition: Application of Collective Decision Making Concepts
    Brester, Christina
    Semenkin, Eugene
    Sidorov, Maxim
    INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE (ICCSAI 2014), 2015, : 216 - 220
  • [26] Contemporary Stochastic Feature Selection Algorithms for Speech-based Emotion Recognition
    Sidorov, Maxim
    Brester, Christina
    Schmitt, Alexander
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2699 - 2703
  • [27] Speaker-dependent automatic helium speech normalisation
    Podhorski, A
    Sawicki, J
    Brykalski, A
    ICECS 2000: 7TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS & SYSTEMS, VOLS I AND II, 2000, : 282 - 285
  • [28] A Path Signature Approach for Speech-Based Dementia Detection
    Pan, Yilin
    Lu, Mingyu
    Shi, Yanpei
    Zhang, Haiyang
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2880 - 2884
  • [29] Towards Speech-Based Collaboration Detection in a Noisy Classroom
    Shahrokhian, Bahar
    VanLehn, Kurt
    ARTIFICIAL INTELLIGENCE IN EDUCATION: POSTERS AND LATE BREAKING RESULTS, WORKSHOPS AND TUTORIALS, INDUSTRY AND INNOVATION TRACKS, PRACTITIONERS AND DOCTORAL CONSORTIUM, PT II, 2022, 13356 : 65 - 70
  • [30] Speech-based services
    Furman, DS
    Cosky, MJ
    Thomson, DL
    O'Brien, SA
    Sumner, EE
    BELL LABS TECHNICAL JOURNAL, 1999, 4 (02) : 88 - 97