SPEECH EMOTION RECOGNITION USING AUTOENCODER BOTTLENECK FEATURES AND LSTM

Cited by: 0
Authors
Huang, Kun-Yi [1]
Wu, Chung-Hsien [1]
Yang, Tsung-Hsien [1]
Su, Ming-Hsiang [1]
Chou, Jia-Hui [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan, Taiwan
Keywords
Speech emotion recognition; bottleneck features; long short-term memory
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A complete emotional expression follows a complex temporal course over a conversation. Prior work on utterance-level and segment-level processing does not adequately account for subtle differences in characteristics or for historical information. Because the Deep Scattering Spectrum (DSS) captures more detailed energy distributions in the frequency domain than Low-Level Descriptors (LLDs), this work combines LLDs and DSS as the speech features. An autoencoder neural network is then applied to extract bottleneck features for dimensionality reduction. Finally, a long short-term memory (LSTM) network characterizes the temporal variation of speech emotion for emotion recognition. The MHMC emotion database was collected and used for performance evaluation. Experimental results show that the proposed method, using bottleneck features from the combination of LLDs and DSS, achieved an emotion recognition accuracy of 98.1%, outperforming systems that use LLDs or DSS individually.
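The pipeline described in the abstract (concatenated LLD and DSS features, an autoencoder bottleneck, then an LSTM over the segment sequence) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: every dimension, the random untrained weights, the tanh encoder, and the function names are assumptions chosen only to show how the data flows and what shapes result. Feature extraction itself is replaced by random placeholder arrays.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
N_LLD, N_DSS = 32, 64   # per-segment LLD and DSS feature dimensions
N_BOTTLENECK = 16       # autoencoder bottleneck width
N_HIDDEN = 8            # LSTM hidden state size
N_EMOTIONS = 4          # number of emotion classes
N_SEGMENTS = 10         # segments in one utterance


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init(shape):
    """Small random weights; a real system would learn these by training."""
    return rng.normal(scale=0.1, size=shape)


# Autoencoder: the encoder compresses to the bottleneck, the decoder expands back.
W_enc, b_enc = init((N_LLD + N_DSS, N_BOTTLENECK)), np.zeros(N_BOTTLENECK)
W_dec, b_dec = init((N_BOTTLENECK, N_LLD + N_DSS)), np.zeros(N_LLD + N_DSS)


def encode(x):
    """Map input features to bottleneck features (dimensionality reduction)."""
    return np.tanh(x @ W_enc + b_enc)


def decode(z):
    """Reconstruct the input; used only for the autoencoder training objective."""
    return z @ W_dec + b_dec


# LSTM parameters with the four gates (input, forget, candidate, output) stacked.
W_x = init((N_BOTTLENECK, 4 * N_HIDDEN))
W_h = init((N_HIDDEN, 4 * N_HIDDEN))
b_gates = np.zeros(4 * N_HIDDEN)
W_out = init((N_HIDDEN, N_EMOTIONS))


def lstm_classify(seq):
    """Run an LSTM over the segment sequence and return class probabilities."""
    h = np.zeros(N_HIDDEN)
    c = np.zeros(N_HIDDEN)
    for x_t in seq:
        gates = x_t @ W_x + h @ W_h + b_gates
        i, f, g, o = np.split(gates, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # update hidden state
    logits = h @ W_out
    exp = np.exp(logits - logits.max())               # numerically stable softmax
    return exp / exp.sum()


# Placeholder features standing in for real LLD and DSS extraction.
lld = rng.normal(size=(N_SEGMENTS, N_LLD))
dss = rng.normal(size=(N_SEGMENTS, N_DSS))

features = np.concatenate([lld, dss], axis=1)  # combine LLDs and DSS per segment
bottleneck = encode(features)                  # reduced-dimension features
reconstruction = decode(bottleneck)            # what autoencoder training would compare
probs = lstm_classify(bottleneck)              # distribution over emotion classes
```

In a trained system the autoencoder would first be fitted to minimize the difference between `reconstruction` and `features`, after which only `encode` would feed the LSTM classifier.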
Pages: 1 / 4
Page count: 4
Related papers
50 in total
  • [31] Learning Salient Features for Speech Emotion Recognition Using CNN
    Liu, Jiamu
    Han, Wenjing
    Ruan, Huabin
    Chen, Xiaomin
    Jiang, Dongmei
    Li, Haifeng
    2018 FIRST ASIAN CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII ASIA), 2018
  • [32] Automatic speech based emotion recognition using paralinguistics features
    Hook, J.
    Noroozi, F.
    Toygar, O.
    Anbarjafari, G.
    BULLETIN OF THE POLISH ACADEMY OF SCIENCES-TECHNICAL SCIENCES, 2019, 67 (03) : 479 - 488
  • [33] SPEECH EMOTION RECOGNITION USING SELF-SUPERVISED FEATURES
    Morais, Edmilson
    Hoory, Ron
    Zhu, Weizhong
    Gat, Itai
    Damasceno, Matheus
    Aronowitz, Hagai
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6922 - 6926
  • [34] Emotion Recognition from Speech using Prosodic and Linguistic Features
    Pervaiz, Mahwish
    Khan, Tamim Ahmed
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (08) : 84 - 90
  • [35] Automatic speech emotion recognition using modulation spectral features
    Wu, Siqing
    Falk, Tiago H.
    Chan, Wai-Yip
    SPEECH COMMUNICATION, 2011, 53 (05) : 768 - 785
  • [36] Two-stream Emotion-embedded Autoencoder for Speech Emotion Recognition
    Zhang, Chenghao
    Xue, Lei
    2021 IEEE INTERNATIONAL IOT, ELECTRONICS AND MECHATRONICS CONFERENCE (IEMTRONICS), 2021, : 969 - 974
  • [37] ON THE USEFULNESS OF STATISTICAL NORMALISATION OF BOTTLENECK FEATURES FOR SPEECH RECOGNITION
    Loweimi, Erfan
    Bell, Peter
    Renals, Steve
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3862 - 3866
  • [38] DEEP COMPLEMENTARY BOTTLENECK FEATURES FOR VISUAL SPEECH RECOGNITION
    Petridis, Stavros
    Pantic, Maja
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 2304 - 2308
  • [39] Unsupervised Feature Learning for Speech Emotion Recognition Based on Autoencoder
    Ying, Yangwei
    Tu, Yuanwu
    Zhou, Hong
    ELECTRONICS, 2021, 10 (17)
  • [40] Performance Evaluation of Deep Autoencoder Network for Speech Emotion Recognition
    AndleebSiddiqui, Maria
    Hussain, Wajahat
    Ali, Syed Abbas
    Danish-ur-Rehman
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (02) : 606 - 611