EMOCEPTION: AN INCEPTION INSPIRED EFFICIENT SPEECH EMOTION RECOGNITION NETWORK

Authors
Singh, Chirag [1]
Kumar, Abhay [1]
Nagar, Ajay [1]
Tripathi, Suraj [1]
Yenigalla, Promod [1]
Affiliation
[1] Samsung R&D Inst India, Bangalore, Karnataka, India
Keywords
Speech Emotion Recognition; Inception; Multi-Task Learning; CNN;
DOI
10.1109/asru46091.2019.9004020
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This research proposes Emoception, a Deep Neural Network architecture for Speech Emotion Recognition that takes inspiration from Inception modules. The network takes speech features such as Mel-Frequency Spectral Coefficients (MFSC) or Mel-Frequency Cepstral Coefficients (MFCC) as input and recognizes the relevant emotion in the speech. We use the USC-IEMOCAP dataset for training, but the limited amount of training data and the large depth of the network make the network prone to overfitting, which reduces validation accuracy. The Emoception network overcomes this problem by extending in width without an increase in computational cost. We also employ a powerful regularization technique, Multi-Task Learning (MTL), to make the network robust. The model using MFSC input with MTL improves accuracy by 1.6% over Emoception without MTL. We report an overall accuracy improvement of around 4.6% compared to existing state-of-the-art methods for four emotion classes on the IEMOCAP dataset.
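The abstract's claim that a network can grow in width without added computational cost can be illustrated with the usual Inception-style arithmetic: a 1x1 convolution reduces the channel count before a larger kernel is applied. The sketch below is only an illustration under stated assumptions. The feature-map size, channel counts, and the helper name `conv_macs` are hypothetical and are not taken from the Emoception paper.

```python
def conv_macs(height, width, in_channels, out_channels, kernel):
    """Multiply-accumulate count for a stride-1, same-padded 2D convolution."""
    return height * width * out_channels * kernel * kernel * in_channels


# Hypothetical sizes, chosen only for illustration (not from the paper).
H, W = 28, 28          # spatial size of the feature map
C_IN = 192             # input channels
C_OUT = 32             # output channels of the 5x5 branch
C_REDUCED = 16         # channels after the 1x1 bottleneck

# Cost of applying the 5x5 convolution directly to all input channels.
direct = conv_macs(H, W, C_IN, C_OUT, kernel=5)

# Cost of a 1x1 reduction followed by the same 5x5 convolution.
bottleneck = (
    conv_macs(H, W, C_IN, C_REDUCED, kernel=1)
    + conv_macs(H, W, C_REDUCED, C_OUT, kernel=5)
)

print(f"direct 5x5:       {direct:,} MACs")
print(f"1x1 then 5x5:     {bottleneck:,} MACs")
print(f"cost ratio:       {direct / bottleneck:.1f}x")
```

Under these assumed sizes, the bottlenecked branch is far cheaper than the direct one, and that saving is what can be spent on additional parallel branches of different kernel sizes.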
Pages: 787-791
Number of pages: 5