The Impact of Attention Mechanisms on Speech Emotion Recognition

Cited by: 20
Authors
Chen, Shouyan [1]
Zhang, Mingyan [1]
Yang, Xiaofen [1]
Zhao, Zhijia [1]
Zou, Tao [1]
Sun, Xinqi [1]
Affiliations
[1] Guangzhou Univ, Sch Mech & Elect Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
artificial intelligence; speech emotion recognition; attention mechanism; neural networks;
DOI
10.3390/s21227530
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Speech emotion recognition (SER) plays an important role in real-time applications of human-machine interaction. Attention mechanisms are widely used to improve SER performance, but the rules governing when each type applies have not been discussed in depth. This paper discusses the difference between Global-Attention and Self-Attention and explores the rules for applying them when constructing SER classifiers. The experimental results show that, for models built from a CNN and an LSTM, Global-Attention improves the accuracy of the sequential model, while Self-Attention improves the accuracy of the parallel model. Based on this finding, a classifier for SER (the CNN-LSTMx2+Global-Attention model) is proposed. Experimental results show that it achieves an accuracy of 85.427% on the EMO-DB dataset.
Pages: 20
Related papers
50 in total
  • [41] Emotion Recognition in Video Streams Using Intramodal and Intermodal Attention Mechanisms
    Mocanu, Bogdan
    Tapu, Ruxandra
    ADVANCES IN VISUAL COMPUTING, ISVC 2022, PT II, 2022, 13599 : 295 - 306
  • [42] Lightweight attention mechanisms for EEG emotion recognition for brain computer interface
    Gunda, Naresh Kumar
    Khalaf, Mohammed I.
    Bhatnagar, Shaleen
    Quraishi, Aadam
    Gudala, Leeladhar
    Venkata, Ashok Kumar Pamidi
    Alghayadh, Faisal Yousef
    Alsubai, Shtwai
    Bhatnagar, Vaibhav
    JOURNAL OF NEUROSCIENCE METHODS, 2024, 410
  • [43] Improved ShuffleNet V2 network with attention for speech emotion recognition
    Udeh, Chinonso Paschal
    Chen, Luefeng
    Du, Sheng
    Liu, Yulong
    Li, Min
    Wu, Min
    INFORMATION SCIENCES, 2025, 689
  • [44] AUTOMATIC SPEECH EMOTION RECOGNITION USING RECURRENT NEURAL NETWORKS WITH LOCAL ATTENTION
    Mirsamadi, Seyedmahdad
    Barsoum, Emad
    Zhang, Cha
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2227 - 2231
  • [45] A speech emotion recognition method for the elderly based on feature fusion and attention mechanism
    Jian, Qijian
    Xiang, Min
    Huang, Wei
    THIRD INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION; NETWORK AND COMPUTER TECHNOLOGY (ECNCT 2021), 2022, 12167
  • [46] Attention Assisted Discovery of Sub-Utterance Structure in Speech Emotion Recognition
    Huang, Che-Wei
    Narayanan, Shrikanth S.
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1387 - 1391
  • [47] AN INTERACTION-AWARE ATTENTION NETWORK FOR SPEECH EMOTION RECOGNITION IN SPOKEN DIALOGS
    Yeh, Sung-Lin
    Lin, Yun-Shao
    Lee, Chi-Chun
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6685 - 6689
  • [48] Speech Emotion Recognition Using Multihead Attention in Both Time and Feature Dimensions
    Xie, Yue
    Liang, Ruiyu
    Liang, Zhenlin
    Zhao, Xiaoyan
    Zeng, Wenhao
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (05) : 1098 - 1101
  • [49] Attention-enhanced Connectionist Temporal Classification for Discrete Speech Emotion Recognition
    Zhao, Ziping
    Bao, Zhongtian
    Zhang, Zixing
    Cummins, Nicholas
    Wang, Haishuai
    Schuller, Bjorn W.
    INTERSPEECH 2019, 2019, : 206 - 210
  • [50] Hierarchical convolutional neural networks with post-attention for speech emotion recognition
    Fan, Yonghong
    Huang, Heming
    Han, Henry
    NEUROCOMPUTING, 2025, 615