PREDICTION-BASED LEARNING FOR CONTINUOUS EMOTION RECOGNITION IN SPEECH

Cited by: 0
Authors
Han, Jing [1]
Zhang, Zixing [1]
Ringeval, Fabien [2]
Schuller, Bjorn [1, 3]
Affiliations
[1] Univ Passau, Chair Complex Intelligent Syst, Passau, Germany
[2] Univ Grenoble Alpes, Lab Informat Grenoble, Grenoble, France
[3] Imperial Coll London, Dept Comp, London, England
Funding
European Union Seventh Framework Programme
Keywords
Affective computing; hierarchical regression models; support vector regression; long short-term memory; algorithm
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, a prediction-based learning framework is proposed for continuous emotion recognition from speech, one of the key components of affective computing in multimedia. The main goal of this framework is to exploit the individual advantages of different regression models cooperatively and to the fullest extent. To this end, we take two widely used regression models as examples: support vector regression and the bidirectional long short-term memory recurrent neural network. We concatenate the two models in a tandem structure in different ways, forming a unified cascaded framework. The outputs predicted by the first model are combined with the original features as the input to the second model, which produces the final predictions. Experimental results on a time- and value-continuous spontaneous emotion database (RECOLA) show that the prediction-based learning framework significantly outperforms the individual models for both the arousal and valence dimensions, and provides significantly better results than other state-of-the-art methodologies on this corpus.
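The tandem structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `LinearSGD` is a hypothetical toy regressor standing in for both support vector regression and the bidirectional LSTM network, and the names `augment` and `tandem_fit_predict` are assumptions made for illustration. The only part taken from the abstract is the data flow, in which the first model's predictions are appended to the original features before the second model is trained.

```python
from typing import List

Matrix = List[List[float]]


class LinearSGD:
    """Toy linear regressor trained by per-sample gradient descent.

    Assumption: a stand-in for the SVR and BLSTM-RNN models named in the
    abstract; any object with fit/predict could be used instead.
    """

    def __init__(self, lr: float = 0.05, epochs: int = 500) -> None:
        self.lr = lr
        self.epochs = epochs
        self.w: List[float] = []
        self.b = 0.0

    def _predict_one(self, row: List[float]) -> float:
        return sum(w * x for w, x in zip(self.w, row)) + self.b

    def fit(self, X: Matrix, y: List[float]) -> None:
        self.w = [0.0] * len(X[0])
        self.b = 0.0
        for _ in range(self.epochs):
            for row, target in zip(X, y):
                err = self._predict_one(row) - target
                self.b -= self.lr * err
                self.w = [w - self.lr * err * x for w, x in zip(self.w, row)]

    def predict(self, X: Matrix) -> List[float]:
        return [self._predict_one(row) for row in X]


def augment(X: Matrix, preds: List[float]) -> Matrix:
    """Append each first-stage prediction to its original feature vector."""
    return [row + [p] for row, p in zip(X, preds)]


def tandem_fit_predict(first, second, X_train: Matrix, y_train: List[float],
                       X_test: Matrix) -> List[float]:
    """Train the two-stage cascade and return second-stage predictions."""
    first.fit(X_train, y_train)
    # The second model sees the original features plus the first model's output.
    second.fit(augment(X_train, first.predict(X_train)), y_train)
    return second.predict(augment(X_test, first.predict(X_test)))


# Toy data (made up for illustration): one feature per frame, target 2x + 1.
X_toy = [[0.0], [0.25], [0.5], [0.75], [1.0]]
y_toy = [2.0 * row[0] + 1.0 for row in X_toy]
toy_predictions = tandem_fit_predict(LinearSGD(), LinearSGD(), X_toy, y_toy, [[0.5]])
```

Because the abstract says the two models are cascaded in different ways, swapping which model is passed as `first` and which as `second` would give the other ordering under this sketch.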
Pages: 5005-5009
Page count: 5