Sparse Kernel Reduced-Rank Regression for Bimodal Emotion Recognition From Facial Expression and Speech

Cited by: 74
Authors
Yan, Jingjie [1]
Zheng, Wenming [3]
Xu, Qinyu [1]
Lu, Guanming [1]
Li, Haibo [1,2]
Wang, Bei [3]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Jiangsu Prov Key Lab Image Proc & Image Commun, Coll Telecomm & Informat Engn, Nanjing 210003, Peoples R China
[2] Royal Inst Technol, Sch Comp Sci & Commun, S-11428 Stockholm, Sweden
[3] Southeast Univ, Key Lab Child Dev & Learning Sci, Minist Educ, Res Ctr Learning Sci, Nanjing 210096, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Bimodal emotion recognition; facial expression; feature fusion; sparse kernel reduced-rank regression (SKRRR); speech; PHENOTYPES; FRAMEWORK; FUSION; FACE;
DOI
10.1109/TMM.2016.2557721
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline code
0812;
Abstract
A novel bimodal emotion recognition approach from facial expression and speech, based on the sparse kernel reduced-rank regression (SKRRR) fusion method, is proposed in this paper. In this method, we use the openSMILE feature extractor and the scale invariant feature transform feature descriptor to extract effective features from the speech modality and the facial expression modality, respectively, and then propose the SKRRR fusion approach to fuse the emotion features of the two modalities. The proposed SKRRR method is a nonlinear extension of traditional reduced-rank regression (RRR), in which both the predictor and response feature vectors of RRR are kernelized by being mapped onto two high-dimensional feature spaces via two nonlinear mappings, respectively. To solve the SKRRR problem, we propose a sparse representation (SR)-based approach to find the optimal solution of the coefficient matrices of SKRRR, where the SR technique is introduced to fully account for the different contributions of the training samples to the optimal solution of SKRRR. Finally, we use the eNTERFACE '05 and AFEW4.0 bimodal emotion databases to conduct monomodal and bimodal emotion recognition experiments; the results indicate that our approach achieves the highest or a comparable bimodal emotion recognition rate among several state-of-the-art approaches.
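The SKRRR fusion described above builds on classical reduced-rank regression. As background, here is a minimal NumPy sketch of plain linear RRR only (no kernels, no sparse representation, so it is not the authors' SKRRR): under a Frobenius-norm loss, the rank-r solution is the ordinary least-squares fit projected onto the top-r right singular vectors of the fitted responses.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical RRR: fit Y ~ X @ B subject to rank(B) <= rank.

    Solution under a Frobenius-norm loss: full-rank OLS coefficients,
    then projection of the fitted values onto their top-`rank`
    right singular vectors.
    """
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)  # full-rank OLS coefficients
    Y_fit = X @ B_ols                              # fitted responses
    # top-`rank` right singular vectors of the fitted responses
    _, _, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    P = Vt[:rank].T @ Vt[:rank]                    # rank-r projection matrix
    return B_ols @ P                               # rank-constrained coefficients

# toy check: responses generated by a rank-1 coefficient matrix
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
B_true = np.outer(rng.standard_normal(5), rng.standard_normal(3))  # rank 1
Y = X @ B_true + 0.01 * rng.standard_normal((200, 3))
B_hat = reduced_rank_regression(X, Y, rank=1)
print(np.linalg.matrix_rank(B_hat))  # 1
```

The paper's SKRRR replaces X and Y with kernelized (nonlinearly mapped) features and obtains the coefficient matrices via sparse representation rather than this closed-form SVD step.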
Pages: 1319-1329
Page count: 11
Related papers
50 records in total
  • [41] Feature Fusion Algorithm for Multimodal Emotion Recognition from Speech and Facial Expression Signal
    Han Zhiyan
    Wang Jian
    INTERNATIONAL SEMINAR ON APPLIED PHYSICS, OPTOELECTRONICS AND PHOTONICS (APOP 2016), 2016, 61
  • [42] Facial expression recognition based on kernel partial least squares regression
    Research Center for Learning Science, Southeast University, Nanjing 210096, China
    J. Comput. Inf. Syst., 11: 4281-4289
  • [43] Multi-expression facial animation based on speech emotion recognition
    Research Center for Pervasive Computing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao, 2008, 4: 520-525
  • [44] Human emotion recognition by optimally fusing facial expression and speech feature
    Wang, Xusheng
    Chen, Xing
    Cao, Congjun
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2020, 84
  • [45] Training of reduced-rank linear transformations for multi-layer polynomial acoustic features for speech recognition
    Tahir, Muhammad Ali
    Huang, Heyun
    Zeyer, Albert
    Schlueter, Ralf
    Ney, Hermann
    SPEECH COMMUNICATION, 2019, 110 : 56 - 63
  • [46] A Novel Supervised Bimodal Emotion Recognition Approach Based on Facial Expression and Body Gesture
    Yan, Jingjie
    Lu, Guanming
    Bai, Xiaodong
    Li, Haibo
    Sun, Ning
    Liang, Ruiyu
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2018, E101A (11) : 2003 - 2006
  • [47] Facial Expression Analysis for Emotion Recognition Using Kernel Methods and Statistical Models
    Garcia, Hernan F.
    Torres, Cristian A.
    Marin Hurtado, Jorge Ivan
    2014 XIX SYMPOSIUM ON IMAGE, SIGNAL PROCESSING AND ARTIFICIAL VISION (STSIVA), 2014,
  • [48] A Novel Speech Emotion Recognition Method via Incomplete Sparse Least Square Regression
    Zheng, Wenming
    Xin, Minghai
    Wang, Xiaolan
    Wang, Bei
    IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (05) : 569 - 572
  • [49] A Novel Bimodal Emotion Database from Physiological Signals and Facial Expression
    Yan, Jingjie
    Wang, Bei
    Liang, Ruiyu
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (07): : 1976 - 1979
  • [50] Multi-Modal Emotion Recognition From Speech and Facial Expression Based on Deep Learning
    Cai, Linqin
    Dong, Jiangong
    Wei, Min
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 5726 - 5729