Empirical Interpretation of Speech Emotion Perception with Attention Based Model for Speech Emotion Recognition

Cited by: 19
Authors
Jalal, Md Asif [1 ]
Milner, Rosanna [1 ]
Hain, Thomas [1 ]
Affiliations
[1] Univ Sheffield, Speech & Hearing Grp SPandH, Sheffield, S Yorkshire, England
Source
Keywords
speech emotion recognition; speech emotion intelligibility; computational paralinguistics
DOI
10.21437/Interspeech.2020-3007
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Speech emotion recognition is essential for emotional intelligence, which affects how the context and meaning of speech are understood. Harmonically structured vowel and consonant sounds add indexical and linguistic cues to spoken information. Previous research has debated, from psychological and linguistic points of view, whether vowel sound cues are more important in carrying emotional context. Other research has claimed that emotion information can exist in small overlapping acoustic cues. However, these claims have not been corroborated in computational speech emotion recognition systems. In this research, a convolution-based model and a long short-term memory (LSTM) based model, both using attention, are applied to investigate these theories of speech emotion in computational models. The role of acoustic context and word importance is demonstrated for the task of speech emotion recognition. The proposed models are evaluated on the IEMOCAP corpus and achieve 80.1% unweighted accuracy on purely acoustic data, which is higher than current state-of-the-art models on this task. The phones and words are mapped to the attention vectors, and it is seen that vowel sounds are more important than consonants for defining emotional acoustic cues, and that the model can assign word importance based on acoustic context.
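The abstract reports its result as unweighted accuracy but does not define the metric. In speech emotion recognition work this term commonly means the mean of the per-class recalls, so that each emotion class counts equally regardless of how many utterances it has. The sketch below assumes that reading; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls (assumed definition; every class counts equally)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Average the recall of each class that appears in the reference labels.
    return sum(correct[c] / total[c] for c in total) / len(total)


def weighted_accuracy(y_true, y_pred):
    """Plain accuracy over all utterances, dominated by frequent classes."""
    hits = sum(truth == pred for truth, pred in zip(y_true, y_pred))
    return hits / len(y_true)


# Toy, made-up labels: four "neutral" and two "angry" utterances.
y_true = ["neutral", "neutral", "neutral", "neutral", "angry", "angry"]
y_pred = ["neutral", "neutral", "neutral", "neutral", "angry", "neutral"]

ua = unweighted_accuracy(y_true, y_pred)
wa = weighted_accuracy(y_true, y_pred)
print(f"UA={ua:.3f} WA={wa:.3f}")
```

On an imbalanced set like this toy one, the two metrics diverge: missing half of the minority class lowers the unweighted figure more than the plain accuracy, which is why the distinction matters when comparing reported numbers.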
Pages: 4113-4117
Page count: 5