Context-Independent Multilingual Emotion Recognition from Speech Signals

Cited by: 42
Authors
Vladimir Hozjan
Zdravko Kačič
Affiliations
University of Maribor, Faculty of Electrical Engineering and Computer Science
Keywords
emotions; speech; emotion recognition; cross language emotion recognition
DOI
10.1023/A:1023426522496
Abstract
This paper presents and discusses an analysis of multilingual emotion recognition from speech with database-specific emotional features. Recognition was performed on the English, Slovenian, Spanish, and French InterFace emotional speech databases. The InterFace databases include several neutral speaking styles and six emotions: disgust, surprise, joy, fear, anger, and sadness. Speech features for emotion recognition were determined in two steps. In the first step, low-level features were defined; in the second, high-level features were calculated from the low-level features. The low-level features comprise pitch, the derivative of pitch, energy, the derivative of energy, and the duration of speech segments. The high-level features are statistical representations of the low-level features. Database-specific emotional features were selected from the high-level features that carry the most information about emotions in speech. Speaker-dependent and monolingual emotion recognisers were defined, as well as multilingual recognisers. Emotion recognition was performed using artificial neural networks. The achieved recognition accuracy was highest for speaker-dependent emotion recognition, lower for monolingual emotion recognition, and lowest for multilingual recognition. The database-specific emotional features are the most convenient for use in multilingual emotion recognition. Among speaker-dependent, monolingual, and multilingual emotion recognition, the difference between recognition with all high-level features and recognition with database-specific emotional features is smallest for multilingual emotion recognition, at 3.84%.
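The abstract's two-step feature scheme (low-level prosodic contours first, then high-level statistics over them) can be illustrated with a short sketch. This is not the authors' code: the use of librosa, the specific statistics (mean, standard deviation, min, max, range), the pitch-range settings, and the file name "utterance.wav" are all illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the two-step feature scheme described in the abstract:
# low-level contours (pitch, energy, and their derivatives) are extracted
# first, then high-level features are computed as statistics over them.
# Library choice, statistics, and parameter values are assumptions.
import numpy as np
import librosa

def low_level_features(wav_path, sr=16000):
    """Extract frame-level pitch and energy contours and their derivatives."""
    y, sr = librosa.load(wav_path, sr=sr)
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
    )
    f0 = f0[~np.isnan(f0)]                # keep voiced frames only
    energy = librosa.feature.rms(y=y)[0]  # frame-level RMS energy
    return {
        "pitch": f0,
        "pitch_delta": np.diff(f0) if f0.size > 1 else np.zeros(1),
        "energy": energy,
        "energy_delta": np.diff(energy),
    }

def high_level_features(contours):
    """Statistics (mean, std, min, max, range) over each low-level contour."""
    stats = []
    for values in contours.values():
        stats.extend([
            values.mean(), values.std(),
            values.min(), values.max(),
            values.max() - values.min(),
        ])
    return np.array(stats)

# Example use (file name is hypothetical):
# vector = high_level_features(low_level_features("utterance.wav"))
# Per the abstract, vectors like this would be fed to an artificial neural
# network classifier (e.g. sklearn.neural_network.MLPClassifier).
```

Feature selection of the database-specific emotional features (choosing the high-level statistics most informative about emotion) would then be applied on top of such vectors before training the recognisers.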
Pages: 311 - 320
Number of pages: 9
Related Papers
50 records in total
  • [1] Context-independent acoustic models for Thai speech recognition
    Kasuriya, S
    Kanokphara, S
    Thatphithakkul, N
    Cotsomrong, P
    Sunpethniyom, T
    IEEE INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES 2004 (ISCIT 2004), PROCEEDINGS, VOLS 1 AND 2: SMART INFO-MEDIA SYSTEMS, 2004, : 991 - 994
  • [2] Emotion Recognition from Speech Signal in Multilingual
    Albu, Corina
    Lupu, Eugen
    Arsinte, Radu
    6TH INTERNATIONAL CONFERENCE ON ADVANCEMENTS OF MEDICINE AND HEALTH CARE THROUGH TECHNOLOGY, MEDITECH 2018, 2019, 71 : 157 - 161
  • [3] MULTI-SPEAKER AND CONTEXT-INDEPENDENT ACOUSTICAL CUES FOR AUTOMATIC SPEECH RECOGNITION
    ROSSI, M
    NISHINUMA, Y
    MERCIER, G
    SPEECH COMMUNICATION, 1983, 2 (2-3) : 215 - 217
  • [4] Emotion recognition from Mandarin speech signals
    Pao, TL
    Chen, YT
    Yeh, JH
    2004 INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2004, : 301 - 304
  • [5] Separability and recognition of emotion states in multilingual speech
    Jiang, XQ
    Tian, L
    Han, M
    2005 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, VOLS 1 AND 2, PROCEEDINGS: VOL 1: COMMUNICATION THEORY AND SYSTEMS, 2005, : 861 - 864
  • [6] Integrating Language and Emotion Features for Multilingual Speech Emotion Recognition
    Heracleous, Panikos
    Mohammad, Yasser
    Yoneyama, Akio
    HUMAN-COMPUTER INTERACTION. MULTIMODAL AND NATURAL INTERACTION, HCI 2020, PT II, 2020, 12182 : 187 - 196
  • [7] Emotion recognition and evaluation from Mandarin speech signals
    Pao, Tsanglong
    Chen, Yute
    Yeh, Junheng
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2008, 4 (07): : 1695 - 1709
  • [8] Improving Automatic Emotion Recognition from Speech Signals
    Bozkurt, Elif
    Erzin, Engin
    Erdem, Cigdem Eroglu
    Erdem, A. Tanju
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 312 - +
  • [9] CONTEXT-INDEPENDENT SOLVERS
    Marti, Rafael
    23RD EUROPEAN CONFERENCE ON MODELLING AND SIMULATION (ECMS 2009), 2009, : 7 - 8
  • [10] Enhancing multilingual recognition of emotion in speech by language identification
    Sagha, Hesam
    Matejka, Pavel
    Gavryukova, Maryna
    Povolny, Filip
    Marchi, Erik
    Schuller, Bjoern
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2949 - 2953