Empirical Interpretation of Speech Emotion Perception with Attention Based Model for Speech Emotion Recognition

Cited by: 19
Authors
Jalal, Md Asif [1]
Milner, Rosanna [1]
Hain, Thomas [1]
Affiliations
[1] Univ Sheffield, Speech & Hearing Grp SPandH, Sheffield, S Yorkshire, England
Keywords
speech emotion recognition; speech emotion intelligibility; computational paralinguistics;
DOI
10.21437/Interspeech.2020-3007
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104 ; 100213 ;
Abstract
Speech emotion recognition is essential for emotional intelligence, which affects the understanding of the context and meaning of speech. Harmonically structured vowel and consonant sounds add indexical and linguistic cues to spoken information. Previous research has debated, from psychological and linguistic points of view, whether vowel sound cues are more important in carrying emotional context. Other research has claimed that emotion information can exist in small overlapping acoustic cues. However, these claims have not been corroborated in computational speech emotion recognition systems. In this research, a convolution-based model and a long short-term memory-based model, both using attention, are applied to investigate these theories of speech emotion on computational models. The role of acoustic context and word importance is demonstrated for the task of speech emotion recognition. The proposed models are evaluated on the IEMOCAP corpus, and 80.1% unweighted accuracy is achieved on pure acoustic data, which is higher than current state-of-the-art models on this task. The phones and words are mapped to the attention vectors, and it is seen that vowel sounds are more important than consonants for defining emotion acoustic cues, and that the model can assign word importance based on acoustic context.
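The mapping of phones to attention vectors described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the ARPAbet-style vowel set, the frame-level alignment format, and the toy numbers are all assumptions made for illustration. It sums per-frame attention weights over the frames aligned to each phone and compares the total attention mass assigned to vowels and to consonants.

```python
from collections import defaultdict

# Hypothetical vowel inventory (ARPAbet-style symbols); an assumption for illustration.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def attention_mass_by_phone_class(weights, alignment):
    """Sum frame-level attention weights over vowel and consonant segments.

    weights:   list of per-frame attention weights (assumed to sum to 1).
    alignment: list of (phone, start_frame, end_frame), end exclusive.
    """
    mass = defaultdict(float)
    for phone, start, end in alignment:
        label = "vowel" if phone in VOWELS else "consonant"
        mass[label] += sum(weights[start:end])
    return dict(mass)


# Toy data (invented): six frames aligned to four phones.
weights = [0.05, 0.05, 0.30, 0.30, 0.10, 0.20]
alignment = [("HH", 0, 2), ("AY", 2, 4), ("T", 4, 5), ("IY", 5, 6)]
result = attention_mass_by_phone_class(weights, alignment)
print(result)
```

Summed mass favours longer segments, so a comparison of this kind might also divide each total by the number of frames in that class; whether the paper normalizes this way is not stated in the abstract.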
Pages: 4113-4117
Page count: 5
Related papers
50 in total
  • [1] Speech emotion recognition based on emotion perception
    Gang Liu
    Shifang Cai
    Ce Wang
    EURASIP Journal on Audio, Speech, and Music Processing, 2023
  • [2] Speech emotion recognition based on emotion perception
    Liu, Gang
    Cai, Shifang
    Wang, Ce
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2023, 2023 (01)
  • [3] Speech Emotion Recognition Based on Speech Segment Using LSTM with Attention Model
    Atmaja, Bagus Tris
    Akagi, Masato
    2019 IEEE INTERNATIONAL CONFERENCE ON SIGNALS AND SYSTEMS (ICSIGSYS), 2019, : 40 - 44
  • [4] Emotion Perception and Recognition from Speech
    Wu, Chung-Hsien
    Yeh, Jui-Feng
    Chuang, Ze-Jing
    AFFECTIVE INFORMATION PROCESSING, 2009, : 93 - +
  • [5] Speech emotion recognition based on listener-dependent emotion perception models
    Ando, Atsushi
    Mori, Takeshi
    Kobashikawa, Satoshi
    Toda, Tomoki
    APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2021, 10
  • [6] Speech emotion recognition using emotion perception spectral feature
    Jiang, Lin
    Tan, Ping
    Yang, Junfeng
    Liu, Xingbao
    Wang, Chao
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (11):
  • [7] Speech emotion recognition based on an improved brain emotion learning model
    Liu, Zhen-Tao
    Xie, Qiao
    Wu, Min
    Cao, Wei-Hua
    Mei, Ying
    Mao, Jun-Wei
    NEUROCOMPUTING, 2018, 309 : 145 - 156
  • [8] The Impact of Attention Mechanisms on Speech Emotion Recognition
    Chen, Shouyan
    Zhang, Mingyan
    Yang, Xiaofen
    Zhao, Zhijia
    Zou, Tao
    Sun, Xinqi
    SENSORS, 2021, 21 (22)
  • [9] Self-attention for Speech Emotion Recognition
    Tarantino, Lorenzo
    Garner, Philip N.
    Lazaridis, Alexandros
    INTERSPEECH 2019, 2019, : 2578 - 2582
  • [10] Attention Based Fully Convolutional Network for Speech Emotion Recognition
    Zhang, Yuanyuan
    Du, Jun
    Wang, Zirui
    Zhang, Jianshu
    Tu, Yanhui
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1771 - 1775