Multimodal emotion recognition based on feature selection and extreme learning machine in video clips

Cited by: 0

Authors:

Bei Pan

Kaoru Hirota

Zhiyang Jia

Linhui Zhao

Xiaoming Jin

Yaping Dai

Affiliations:

[1] Beijing Institute of Technology, School of Automation

[2] Beijing Union University, College of Robotics

[3] Beijing Engineering Research Center of Smart Mechanical Innovation Design Service

Source:

Journal of Ambient Intelligence and Humanized Computing | 2023 / Vol. 14

Keywords:

Emotion recognition; Multimodal fusion; Evolutionary optimization; Feature selection; Extreme learning machine;

DOI:

Not available

Abstract:

Multimodal fusion-based emotion recognition has attracted increasing attention in affective computing because different modalities provide complementary information. One of the main challenges in designing a reliable and effective model is defining and extracting appropriate emotional features from the different modalities. In this paper, we present a novel multimodal emotion recognition framework for estimating categorical emotions, in which visual and audio signals serve as the multimodal input. The model learns neural appearance and key emotion frames using a statistical geometric method, which acts as a pre-processor to save computation. Discriminative emotion features from the visual and audio modalities are extracted through evolutionary optimization and then fed to optimized extreme learning machine (ELM) classifiers for unimodal emotion recognition. Finally, a decision-level fusion strategy integrates the emotions predicted by the different classifiers to enhance overall performance. The effectiveness of the proposed method is demonstrated on three public datasets: the acted CK+ dataset, the acted Enterface05 dataset, and the spontaneous BAUM-1s dataset.
Average recognition rates of 93.53% on CK+, 91.62% on Enterface05, and 60.77% on BAUM-1s are obtained. The recognition results obtained by fusing the emotions predicted from the visual and audio modalities are superior to both unimodal recognition and the concatenation of individual features.
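
The decision-level fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule the authors use, so the weighted-sum rule below is an assumption, as are the function names, the `visual_weight` parameter, and the three-class label set. It only shows the general idea: each unimodal classifier produces per-class scores, and the fused decision is taken over their combination.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical label set, for illustration only.
EMOTIONS = ("anger", "happiness", "sadness")


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw classifier scores to a vector that sums to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_decisions(
    visual_scores: Sequence[float],
    audio_scores: Sequence[float],
    visual_weight: float = 0.5,
) -> Tuple[int, List[float]]:
    """Weighted-sum decision-level fusion (an assumed rule, not the paper's).

    Returns the index of the winning class and the fused score vector.
    """
    if len(visual_scores) != len(audio_scores):
        raise ValueError("both classifiers must score the same classes")
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")

    p_visual = softmax(visual_scores)
    p_audio = softmax(audio_scores)
    fused = [
        visual_weight * pv + (1.0 - visual_weight) * pa
        for pv, pa in zip(p_visual, p_audio)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    # The visual classifier is confident in class 0; the audio one leans to class 1.
    label, fused = fuse_decisions([2.0, 0.1, 0.1], [0.2, 0.3, 0.1])
    print(EMOTIONS[label], [round(p, 3) for p in fused])
```

In a sketch like this, the weight would typically be tuned on validation data; setting `visual_weight` to 0 or 1 reduces the rule to a single modality.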

Pages: 1903 - 1917

Page count: 14

50 records in total

[1] Multimodal emotion recognition based on feature selection and extreme learning machine in video clips
Pan, Bei
Hirota, Kaoru
Jia, Zhiyang
Zhao, Linhui
Jin, Xiaoming
Dai, Yaping
JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2021, 14 (3) : 1903 - 1917
[2] Speech emotion recognition based on feature selection and extreme learning machine decision tree
Liu, Zhen-Tao
Wu, Min
Cao, Wei-Hua
Mao, Jun-Wei
Xu, Jian-Ping
Tan, Guan-Zheng
NEUROCOMPUTING, 2018, 273 : 271 - 280
[3] Multi-Feature Based Emotion Recognition for Video Clips
Liu, Chuanhe
Tang, Tianhao
Lv, Kui
Wang, Minghao
ICMI'18: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2018, : 630 - 634
[4] A FEATURE FUSION METHOD BASED ON EXTREME LEARNING MACHINE FOR SPEECH EMOTION RECOGNITION
Guo, Lili
Wang, Longbiao
Dang, Jianwu
Zhang, Linjuan
Guan, Haotian
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 2666 - 2670
[5] Feature Selection Based on Extreme Learning Machine
Wang, Zhaoxi
Zhao, Meng
Chen, Shengyong
ICDLT 2019: 2019 3RD INTERNATIONAL CONFERENCE ON DEEP LEARNING TECHNOLOGIES, 2019, : 57 - 63
[6] Multimodal emotion recognition based on peak frame selection from video
Zhalehpour, Sara
Akhtar, Zahid
Erdem, Cigdem Eroglu
SIGNAL IMAGE AND VIDEO PROCESSING, 2016, 10 (05) : 827 - 834
[7] Speech emotion recognition using multimodal feature fusion with machine learning approach
Sandeep Kumar Panda
Ajay Kumar Jena
Mohit Ranjan Panda
Susmita Panda
Multimedia Tools and Applications, 2023, 82 : 42763 - 42781
[8] Multimodal emotion recognition based on peak frame selection from video
Sara Zhalehpour
Zahid Akhtar
Cigdem Eroglu Erdem
Signal, Image and Video Processing, 2016, 10 : 827 - 834
[9] Speech emotion recognition using multimodal feature fusion with machine learning approach
Panda, Sandeep Kumar
Jena, Ajay Kumar
Panda, Mohit Ranjan
Panda, Susmita
MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (27) : 42763 - 42781
[10] Hierarchical extreme puzzle learning machine-based emotion recognition using multimodal physiological signals
Pradhan, Anushka
Srivastava, Subodh
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 83