Speech Emotion Recognition Based on Attention MCNN Combined With Gender Information

Cited by: 5
Authors
Hu, Zhangfang [1]
LingHu, Kehuan [1]
Yu, Hongling [1]
Liao, Chenzhuo [1]
Affiliations
[1] Chongqing Univ Posts & Telecommun CQUPT, Key Lab Optoelect Informat Sensing & Technol, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Emotion recognition; Speech recognition; Mel frequency cepstral coefficient; Gender issues; Feature extraction; Convolutional neural networks; Three-dimensional displays; SER; convolutional neural network; gender information; attention; GRU; FEATURES;
DOI
10.1109/ACCESS.2023.3278106
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Emotion recognition is susceptible to interference such as feature redundancy and speaker gender differences, which lowers recognition accuracy. This paper proposes a speech emotion recognition (SER) method based on an attention mixed convolutional neural network (MCNN) combined with gender information. The method has two stages: gender recognition and emotion recognition. (1) An MCNN identifies gender and classifies speech samples as male or female. (2) Based on the first-stage classification output, a gender-specific emotion recognition model is built by introducing coordinated attention and a series of gated recurrent units connected to the attention mechanism (A-GRUs), producing emotion recognition results for each gender. The inputs to both stages are dynamic 3D MFCC features generated from the original speech database. The proposed method achieves 95.02% and 86.34% accuracy on the EMO-DB and RAVDESS datasets, respectively. The experimental results show that the proposed SER system combined with gender information significantly improves recognition performance.
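The two-stage flow described in the abstract (classify gender first, then hand the same features to a gender-specific emotion model) can be sketched roughly as follows. This is only an illustrative sketch of the routing logic, not the authors' implementation: the class name TwoStageSER, the stub models, and the nested-list feature layout are all hypothetical, and the stubs stand in for the MCNN and the attention/A-GRU emotion networks, which the abstract does not specify in detail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical stand-in for a 3D feature tensor: channels x frames x coefficients.
Features = Sequence[Sequence[Sequence[float]]]


@dataclass
class TwoStageSER:
    """Stage 1 predicts gender; stage 2 runs the emotion model for that gender."""

    gender_model: Callable[[Features], str]
    emotion_models: Dict[str, Callable[[Features], str]]

    def predict(self, features: Features) -> Tuple[str, str]:
        # Stage 1: gender recognition.
        gender = self.gender_model(features)
        if gender not in self.emotion_models:
            raise KeyError(f"no emotion model registered for gender {gender!r}")
        # Stage 2: gender-specific emotion recognition on the same features.
        emotion = self.emotion_models[gender](features)
        return gender, emotion


def _mean(features: Features) -> float:
    """Average of all values in the nested feature structure."""
    values = [v for channel in features for frame in channel for v in frame]
    return sum(values) / len(values)


# Stub models (assumptions for illustration only, not trained networks).
def stub_gender_model(features: Features) -> str:
    return "female" if _mean(features) >= 0.5 else "male"


def stub_male_emotion_model(features: Features) -> str:
    return "anger"


def stub_female_emotion_model(features: Features) -> str:
    return "happiness"


system = TwoStageSER(
    gender_model=stub_gender_model,
    emotion_models={
        "male": stub_male_emotion_model,
        "female": stub_female_emotion_model,
    },
)

# Three identical channels of two frames with two coefficients each.
sample = [[[0.9, 0.8], [0.7, 0.6]] for _ in range(3)]
result = system.predict(sample)
```

In a real system the stubs would be replaced by trained models, and an error in the first stage would propagate to the second, which is one design trade-off of routing on a hard gender decision.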
Pages: 50285-50294
Number of pages: 10