Visual-audio emotion recognition based on multi-task and ensemble learning with multiple features

Cited by: 45
Authors
Hao M. [1 ,2 ]
Cao W.-H. [1 ,2 ]
Liu Z.-T. [1 ,2 ]
Wu M. [1 ,2 ]
Xiao P. [1 ,2 ]
Affiliations
[1] School of Automation, China University of Geosciences, Wuhan
[2] Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan
Source
Neurocomputing, 2020, Vol. 391 / Elsevier B.V., Netherlands (corresponding author: Cao, Wei-Hua, weihuacao@cug.edu.cn)
Funding
National Natural Science Foundation of China;
Keywords
Ensemble learning; Multi-task learning; Multiple features; Visual-audio emotion recognition;
DOI
10.1016/j.neucom.2020.01.048
Abstract
An ensemble visual-audio emotion recognition framework based on multi-task and blending learning with multiple features is proposed in this paper. To address the problem that a single kind of feature cannot accurately discriminate between different emotions, we extract two kinds of features for each modality, i.e., Interspeech 2010 and deep features for audio data, and LBP and deep features for visual data, with the intent of accurately identifying different emotions by using different features. Owing to the diversity of the features, SVM classifiers are designed for the hand-crafted features, i.e., Interspeech 2010 features and local LBP features, and CNNs are designed for the deep features, through which four sub-models are obtained. Finally, the blending ensemble algorithm is used to fuse the sub-models and improve the performance of visual-audio emotion recognition. In addition, multi-task learning is applied in the CNN model for deep features, which can predict multiple tasks at the same time with fewer parameters and improves the sensitivity of the single recognition model to the user's emotion by sharing information between different tasks. Experiments are performed on the eNTERFACE database, and the results indicate that the accuracy of the multi-task CNN increases by 3% and 2% on average over the single-task CNN model in speaker-independent and speaker-dependent experiments, respectively. The visual-audio emotion recognition accuracy of our method reaches 81.36% and 78.42% in speaker-independent and speaker-dependent experiments, respectively, outperforming several state-of-the-art works. © 2020
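The blending strategy described in the abstract can be illustrated with a minimal sketch: several sub-models are trained on one data fold, and a meta-learner is then fit on their class-probability outputs over a held-out fold. This is a generic illustration only, not the authors' exact pipeline; the four SVC sub-models on synthetic feature slices stand in for the paper's SVM/CNN sub-models on hand-crafted and deep features, and all names here (`blend_features`, `slices`, the dataset sizes) are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# Synthetic 6-class data standing in for visual-audio emotion features.
X, y = make_classification(n_samples=600, n_features=40, n_informative=20,
                           n_classes=6, random_state=0)

# Blending uses disjoint folds: sub-models are trained on one fold,
# the meta-learner is fit on predictions over the held-out fold.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Four sub-models, each on a different feature slice, mimicking the
# multiple-feature setup (e.g., Interspeech 2010 / LBP / deep features).
slices = [slice(0, 10), slice(10, 20), slice(20, 30), slice(30, 40)]
sub_models = [SVC(probability=True, random_state=0).fit(X_train[:, s], y_train)
              for s in slices]

def blend_features(X_):
    # Concatenate each sub-model's class-probability vector
    # into one meta-feature vector per sample.
    return np.hstack([m.predict_proba(X_[:, s])
                      for m, s in zip(sub_models, slices)])

# Meta-learner fuses the sub-model outputs on the held-out fold.
blender = LogisticRegression(max_iter=1000).fit(blend_features(X_hold), y_hold)
pred = blender.predict(blend_features(X_hold))
```

The key design point is that the meta-learner never sees the fold the sub-models were trained on, which keeps its input probabilities honest estimates rather than overfit training-set scores.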
Pages: 42 - 51
Number of pages: 9
Related papers
50 records in total
  • [1] Visual-audio emotion recognition based on multi-task and ensemble learning with multiple features
    Hao, Man
    Cao, Wei-Hua
    Liu, Zhen-Tao
    Wu, Min
    Xiao, Peng
    NEUROCOMPUTING, 2020, 391 : 42 - 51
  • [2] Speech Emotion Recognition based on Multi-Task Learning
    Zhao, Huijuan
    Han, Zhijie
    Wang, Ruchuan
    2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 186 - 188
  • [3] Audio-Visual Group-based Emotion Recognition using Local and Global Feature Aggregation based Multi-Task Learning
    Li, Sunan
    Lian, Hailun
    Lu, Cheng
    Zhao, Yan
    Tang, Chuangao
    Zong, Yuan
    Zheng, Wenming
    PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2023, 2023, : 741 - 745
  • [4] Multi-Task Ensemble Learning for Affect Recognition
    Gjoreski, Martin
    Lustrek, Mitja
    Gams, Matjaz
    PROCEEDINGS OF THE 2018 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING AND PROCEEDINGS OF THE 2018 ACM INTERNATIONAL SYMPOSIUM ON WEARABLE COMPUTERS (UBICOMP/ISWC'18 ADJUNCT), 2018, : 553 - 558
  • [5] Speech Emotion Recognition with Multi-task Learning
    Cai, Xingyu
    Yuan, Jiahong
    Zheng, Renjie
    Huang, Liang
    Church, Kenneth
    INTERSPEECH 2021, 2021, : 4508 - 4512
  • [6] Emotion recognition in conversations with emotion shift detection based on multi-task learning
    Gao, Qingqing
    Cao, Biwei
    Guan, Xin
    Gu, Tianyun
    Bao, Xing
    Wu, Junyan
    Liu, Bo
    Cao, Jiuxin
    KNOWLEDGE-BASED SYSTEMS, 2022, 248
  • [7] Multi-task Learning for Speech Emotion and Emotion Intensity Recognition
    Yue, Pengcheng
    Qu, Leyuan
    Zheng, Shukai
    Li, Taihao
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1232 - 1237
  • [8] A Multi-Scale Multi-Task Learning Model for Continuous Dimensional Emotion Recognition from Audio
    Li, Xia
    Lu, Guanming
    Yan, Jingjie
    Zhang, Zhengyan
    ELECTRONICS, 2022, 11 (03)
  • [9] Inconsistency-Based Multi-Task Cooperative Learning for Emotion Recognition
    Xu, Yifan
    Cui, Yuqi
    Jiang, Xue
    Yin, Yingjie
    Ding, Jingting
    Li, Liang
    Wu, Dongrui
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (04) : 2017 - 2027
  • [10] Meta Multi-task Learning for Speech Emotion Recognition
    Cai, Ruichu
    Guo, Kaibin
    Xu, Boyan
    Yang, Xiaoyan
    Zhang, Zhenjie
    INTERSPEECH 2020, 2020, : 3336 - 3340