EMOCEPTION: AN INCEPTION INSPIRED EFFICIENT SPEECH EMOTION RECOGNITION NETWORK

Citations: 0
Authors
Singh, Chirag [1]
Kumar, Abhay [1]
Nagar, Ajay [1]
Tripathi, Suraj [1]
Yenigalla, Promod [1]
Affiliations
[1] Samsung R&D Inst India, Bangalore, Karnataka, India
Keywords
Speech Emotion Recognition; Inception; Multi-Task Learning; CNN;
DOI
10.1109/asru46091.2019.9004020
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This research proposes a Deep Neural Network architecture for Speech Emotion Recognition called Emoception, which takes inspiration from Inception modules. The network takes speech features such as Mel-Frequency Spectral Coefficients (MFSC) or Mel-Frequency Cepstral Coefficients (MFCC) as input and recognizes the relevant emotion in the speech. We use the USC-IEMOCAP dataset for training, but the limited amount of training data and the large depth of the network make the network prone to overfitting, which reduces validation accuracy. The Emoception network overcomes this problem by extending in width without an increase in computational cost. We also employ a powerful regularization technique, Multi-Task Learning (MTL), to make the network robust. The model using MFSC input with MTL increases accuracy by 1.6% compared with Emoception without MTL. We report an overall accuracy improvement of around 4.6% over existing state-of-the-art methods for four emotion classes on the IEMOCAP dataset.
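The abstract's point about extending in width without extra computational cost can be illustrated with a rough multiply-accumulate (MAC) count. The sketch below follows the general Inception idea of parallel branches with 1x1 bottlenecks; the feature-map size, channel counts, and the helper `conv_macs` are hypothetical values chosen for illustration and are not taken from the Emoception paper.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs for a stride-1, same-padded k x k convolution on an h x w map."""
    return h * w * c_in * c_out * k * k


# Hypothetical feature-map size and input channels (illustration only).
H, W, C_IN = 32, 32, 192

# Baseline: one 5x5 convolution producing 64 output channels.
plain = conv_macs(H, W, C_IN, 64, 5)

# Inception-style block: four parallel branches concatenated along channels.
# Each larger kernel is preceded by a 1x1 bottleneck that shrinks the channels.
# The pooling branch's own cost is ignored because pooling has no multiplies.
branches = {
    "1x1": conv_macs(H, W, C_IN, 32, 1),
    "1x1 -> 3x3": conv_macs(H, W, C_IN, 48, 1) + conv_macs(H, W, 48, 64, 3),
    "1x1 -> 5x5": conv_macs(H, W, C_IN, 16, 1) + conv_macs(H, W, 16, 32, 5),
    "pool -> 1x1": conv_macs(H, W, C_IN, 32, 1),
}
inception = sum(branches.values())
out_channels = 32 + 64 + 32 + 32  # concatenated width of the four branches

print(f"plain 5x5 conv : {plain:,} MACs, 64 output channels")
print(f"inception block: {inception:,} MACs, {out_channels} output channels")
```

Under these assumed sizes the wider block produces more output channels than the single 5x5 layer while using fewer MACs, which is the kind of trade-off the abstract describes. The actual branch layout of Emoception may differ.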
Pages: 787-791
Number of pages: 5
Related papers
50 in total
  • [41] Bidirectional parallel echo state network for speech emotion recognition
    Ibrahim, Hemin
    Loo, Chu Kiong
    Alnajjar, Fady
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (20): : 17581 - 17599
  • [42] Speech Emotion Recognition Based on Deep Residual Shrinkage Network
    Han, Tian
    Zhang, Zhu
    Ren, Mingyuan
    Dong, Changchun
    Jiang, Xiaolin
    Zhuang, Quansheng
    ELECTRONICS, 2023, 12 (11)
  • [43] A Joint Network Based on Interactive Attention for Speech Emotion Recognition
    Hu, Ying
    Hou, Shijing
    Yang, Huamin
    Huang, Hao
    He, Liang
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1715 - 1720
  • [44] A Study on Speech Emotion Recognition Using a Deep Neural Network
    Lee, Kyong Hee
    Choi, Hyun Kyun
    Jang, Byung Tae
    Kim, Do Hyun
    2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1162 - 1165
  • [45] Improving Speech Emotion Recognition With Adversarial Data Augmentation Network
    Yi, Lu
    Mak, Man-Wai
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (01) : 172 - 184
  • [46] Performance Evaluation of Deep Autoencoder Network for Speech Emotion Recognition
    AndleebSiddiqui, Maria
    Hussain, Wajahat
    Ali, Syed Abbas
    Danish-ur-Rehman
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (02) : 606 - 611
  • [47] Speech Emotion Recognition using MFCC features and LSTM network
    Kumbhar, Harshawardhan S.
    Bhandari, Sheetal U.
    2019 5TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, CONTROL AND AUTOMATION (ICCUBEA), 2019,
  • [48] Transfer Learning of Deep Neural Network for Speech Emotion Recognition
    Huang, Ying
    Hu, Mingqing
    Yu, Xianguo
    Wang, Tao
    Yang, Chen
    PATTERN RECOGNITION (CCPR 2016), PT II, 2016, 663 : 721 - 729
  • [49] Bidirectional parallel echo state network for speech emotion recognition
    Hemin Ibrahim
    Chu Kiong Loo
    Fady Alnajjar
    Neural Computing and Applications, 2022, 34 : 17581 - 17599
  • [50] Speech Emotion Recognition based on Interactive Convolutional Neural Network
    Cheng, Huihui
    Tang, Xiaoyu
    2020 IEEE 3RD INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP 2020), 2020, : 163 - 167