Sparse Kernel Reduced-Rank Regression for Bimodal Emotion Recognition From Facial Expression and Speech

Cited by: 74
Authors
Yan, Jingjie [1]
Zheng, Wenming [3]
Xu, Qinyu [1]
Lu, Guanming [1]
Li, Haibo [1, 2]
Wang, Bei [3]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Jiangsu Prov Key Lab Image Proc & Image Commun, Coll Telecomm & Informat Engn, Nanjing 210003, Peoples R China
[2] Royal Inst Technol, Sch Comp Sci & Commun, S-11428 Stockholm, Sweden
[3] Southeast Univ, Key Lab Child Dev & Learning Sci, Minist Educ, Res Ctr Learning Sci, Nanjing 210096, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bimodal emotion recognition; facial expression; feature fusion; sparse kernel reduced-rank regression (SKRRR); speech; PHENOTYPES; FRAMEWORK; FUSION; FACE;
DOI
10.1109/TMM.2016.2557721
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a novel bimodal emotion recognition approach for facial expression and speech, based on the sparse kernel reduced-rank regression (SKRRR) fusion method. We use the openSMILE feature extractor and the scale-invariant feature transform (SIFT) descriptor to extract features from the speech modality and the facial expression modality, respectively, and then apply the proposed SKRRR approach to fuse the emotion features of the two modalities. SKRRR is a nonlinear extension of traditional reduced-rank regression (RRR) in which both the predictor and response feature vectors are kernelized by mapping them into two high-dimensional feature spaces via two nonlinear mappings. To solve the SKRRR problem, we propose a sparse representation (SR)-based approach to find the optimal coefficient matrices of SKRRR; the SR technique is introduced to account for the different contributions of the training samples to the optimal solution. Finally, we conduct monomodal and bimodal emotion recognition experiments on the eNTERFACE '05 and AFEW4.0 bimodal emotion databases. The results indicate that the proposed approach achieves the highest or a comparable bimodal emotion recognition rate relative to several state-of-the-art approaches.
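For orientation, the following is a minimal sketch of the classical linear reduced-rank regression that the abstract names as the starting point. It is not the SKRRR method itself: there is no kernel mapping and no sparse representation step. The function name `reduced_rank_regression`, the small `ridge` stabilizer, the identity weighting, and the synthetic data are all assumptions made for illustration, and the record does not say which modality serves as predictor or response.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank, ridge=1e-8):
    """Classical linear RRR with identity weighting (illustrative sketch).

    Approximately minimizes ||Y - X B||_F^2 subject to rank(B) <= rank.
    The tiny ridge term is an assumption added for numerical stability.
    """
    n_features = X.shape[1]
    # Ridge-stabilized least-squares coefficients.
    gram = X.T @ X + ridge * np.eye(n_features)
    B_ols = np.linalg.solve(gram, X.T @ Y)
    # Right singular vectors of the fitted responses give the
    # dominant directions in response space.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Project the least-squares coefficients onto the top-`rank` subspace.
    return B_ols @ V_r @ V_r.T

# Synthetic example: a rank-2 coefficient matrix plus small noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
B_true = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
Y = X @ B_true + 0.01 * rng.standard_normal((200, 4))
B_hat = reduced_rank_regression(X, Y, rank=2)
```

Here `B_hat` has the same shape as `B_true` and rank at most 2. A kernelized variant would replace `X` and `Y` with kernel matrices, which this sketch does not attempt.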
Pages: 1319-1329
Number of pages: 11
Related papers
50 items in total
  • [31] Research of emotion recognition based on speech and facial expression
    Wang, Yutai
    Yang, Xinghai
    Zou, Jing
    Telkomnika - Indonesian Journal of Electrical Engineering, 2013, 11 (01): : 83 - 90
  • [32] Speech Emotion Recognition with MPCA and Kernel Partial Least Squares Regression
    Xin, Minghai
    Gu, Weiyi
    Wang, Jinlong
    JOURNAL OF COMPUTERS, 2014, 9 (04) : 998 - 1004
  • [33] Speech Emotion Recognition Based on Kernel Partial Least Squares Regression
    Gu, Weiyi
    2009 THE REGIONAL WORKSHOP OF THE INTERNATIONAL SOCIETY FOR THE STUDY OF BEHAVIOURAL DEVELOPMENT (ISSBD): SOCIAL AND EMOTIONAL DEVELOPMENT IN CHANGING SOCIETIES, 2009, : 93 - 96
  • [34] Multimodal emotion recognition from facial expression and speech based on feature fusion
    Guichen Tang
    Yue Xie
    Ke Li
    Ruiyu Liang
    Li Zhao
    Multimedia Tools and Applications, 2023, 82 : 16359 - 16373
  • [35] Multimodal emotion recognition from facial expression and speech based on feature fusion
    Tang, Guichen
    Xie, Yue
    Li, Ke
    Liang, Ruiyu
    Zhao, Li
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (11) : 16359 - 16373
  • [36] Discovering genetic associations with high-dimensional neuroimaging phenotypes: A sparse reduced-rank regression approach
    Vounou, Maria
    Nichols, Thomas E.
    Montana, Giovanni
    NEUROIMAGE, 2010, 53 (03) : 1147 - 1159
  • [37] Co-sparse reduced-rank regression for association analysis between imaging phenotypes and genetic variants
    Wen, Canhong
    Ba, Hailong
    Pan, Wenliang
    Huang, Meiyan
    BIOINFORMATICS, 2020, 36 (21) : 5214 - 5222
  • [38] Two-way Sparse Reduced-Rank Regression via Scaled Gradient Descent with Hard Thresholding
    Cheng, Cheng
    Zhao, Ziping
    2024 IEEE 13RD SENSOR ARRAY AND MULTICHANNEL SIGNAL PROCESSING WORKSHOP, SAM 2024, 2024,
  • [39] Emotion recognition from the facial image and speech signal
    Go, HJ
    Kwak, KC
    Lee, DJ
    Chun, MG
    SICE 2003 ANNUAL CONFERENCE, VOLS 1-3, 2003, : 2890 - 2895
  • [40] Co-sparse reduced-rank regression for association analysis between imaging phenotypes and genetic variants
    Wen, Canhong
    Ba, Hailong
    Pan, Wenliang
    Huang, Meiyan
    Alzheimers Dis Neuroimaging Initiat
    BIOINFORMATICS, 2021, 36 (21) : 5214 - 5222