EMOCEPTION: AN INCEPTION INSPIRED EFFICIENT SPEECH EMOTION RECOGNITION NETWORK

Authors
Singh, Chirag [1]
Kumar, Abhay [1]
Nagar, Ajay [1]
Tripathi, Suraj [1]
Yenigalla, Promod [1]
Affiliations
[1] Samsung R&D Inst India, Bangalore, Karnataka, India
Keywords
Speech Emotion Recognition; Inception; Multi-Task Learning; CNN
DOI
10.1109/asru46091.2019.9004020
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This research proposes a deep neural network architecture for speech emotion recognition called Emoception, which takes inspiration from Inception modules. The network takes speech features such as Mel-Frequency Spectral Coefficients (MFSC) or Mel-Frequency Cepstral Coefficients (MFCC) as input and recognizes the emotion expressed in the speech. We use the USC-IEMOCAP dataset for training, but the limited amount of training data and the large depth of the network make the network prone to overfitting, which reduces validation accuracy. The Emoception network overcomes this problem by extending in width without an increase in computational cost. We also employ a powerful regularization technique, Multi-Task Learning (MTL), to make the network robust. The model using MFSC input with MTL increases accuracy by 1.6% vis-a-vis Emoception without MTL. We report an overall accuracy improvement of around 4.6% over existing state-of-the-art methods for four emotion classes on the IEMOCAP dataset.
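As background on the two input features named in the abstract: MFCCs are conventionally obtained by applying a discrete cosine transform (DCT) to log mel filterbank energies, which is what MFSC refers to. The following is a minimal sketch of that relationship in plain Python. It is not the authors' feature pipeline; the function names `dct_ii` and `mfsc_to_mfcc` and the sample frame values are illustrative assumptions.

```python
import math


def dct_ii(values):
    """Orthonormal DCT-II of a 1-D sequence (hypothetical helper)."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            v * math.cos(math.pi * (i + 0.5) * k / n)
            for i, v in enumerate(values)
        )
        # Orthonormal scaling: the k = 0 term uses a different factor.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def mfsc_to_mfcc(mfsc_frame, num_coeffs):
    """Keep the first num_coeffs DCT coefficients of one log-mel frame."""
    return dct_ii(mfsc_frame)[:num_coeffs]


# Assumed log-mel energies for a single frame (illustrative values only).
frame = [1.0, 1.0, 1.0, 1.0]
mfcc = mfsc_to_mfcc(frame, 2)
```

A flat frame such as the one above puts all of its energy into the zeroth coefficient, which illustrates how the DCT decorrelates the filterbank channels. MFSC input keeps the filterbank channels in their original frequency order, which is the local structure that convolutional filters can exploit.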
Pages: 787-791
Number of pages: 5