Visual-audio emotion recognition based on multi-task and ensemble learning with multiple features

Cited by: 45
Authors
Hao M. [1 ,2 ]
Cao W.-H. [1 ,2 ]
Liu Z.-T. [1 ,2 ]
Wu M. [1 ,2 ]
Xiao P. [1 ,2 ]
Affiliations
[1] School of Automation, China University of Geosciences, Wuhan
[2] Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan
Source
Neurocomputing, 2020, Vol. 391 / Elsevier B.V., Netherlands (corresponding author: Cao, Wei-Hua, weihuacao@cug.edu.cn)
Funding
National Natural Science Foundation of China;
Keywords
Ensemble learning; Multi-task learning; Multiple features; Visual-audio emotion recognition;
DOI
10.1016/j.neucom.2020.01.048
Abstract
An ensemble visual-audio emotion recognition framework based on multi-task and blending learning with multiple features is proposed in this paper. To address the problem that a single kind of feature cannot accurately discriminate between different emotions, we extract two kinds of features for each modality, i.e., Interspeech 2010 and deep features for audio data, and LBP and deep features for visual data, with the intent of accurately identifying different emotions by using different features. Owing to the diversity of the features, SVM classifiers are designed for the hand-crafted features, i.e., Interspeech 2010 features and local LBP features, and CNNs are designed for the deep features, through which four sub-models are obtained. Finally, the blending ensemble algorithm is used to fuse the sub-models and improve the performance of visual-audio emotion recognition. In addition, multi-task learning is applied in the CNN model for deep features, which can predict multiple tasks at the same time with fewer parameters and improves the sensitivity of the single recognition model to the user's emotion by sharing information between different tasks. Experiments are performed on the eNTERFACE database, and the results indicate that the accuracy of the multi-task CNN increases by 3% and 2% on average over the single-task CNN model in speaker-independent and speaker-dependent experiments, respectively. The visual-audio emotion recognition accuracy of our method reaches 81.36% and 78.42% in speaker-independent and speaker-dependent experiments, respectively, outperforming several state-of-the-art works. © 2020
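The blending strategy described in the abstract can be illustrated with a minimal sketch: several sub-models are trained on one data fold, and a meta-learner is then fit on their class-probability outputs over a held-out fold. This is a generic illustration only, not the authors' exact pipeline; the four SVC sub-models on synthetic feature slices stand in for the paper's SVM/CNN sub-models on hand-crafted and deep features, and all names here (`blend_features`, `slices`, the dataset sizes) are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# Synthetic 6-class data standing in for visual-audio emotion features.
X, y = make_classification(n_samples=600, n_features=40, n_informative=20,
                           n_classes=6, random_state=0)

# Blending uses disjoint folds: sub-models are trained on one fold,
# the meta-learner is fit on predictions over the held-out fold.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Four sub-models, each on a different feature slice, mimicking the
# multiple-feature setup (e.g., Interspeech 2010 / LBP / deep features).
slices = [slice(0, 10), slice(10, 20), slice(20, 30), slice(30, 40)]
sub_models = [SVC(probability=True, random_state=0).fit(X_train[:, s], y_train)
              for s in slices]

def blend_features(X_):
    # Concatenate each sub-model's class-probability vector
    # into one meta-feature vector per sample.
    return np.hstack([m.predict_proba(X_[:, s])
                      for m, s in zip(sub_models, slices)])

# Meta-learner fuses the sub-model outputs on the held-out fold.
blender = LogisticRegression(max_iter=1000).fit(blend_features(X_hold), y_hold)
pred = blender.predict(blend_features(X_hold))
```

The key design point is that the meta-learner never sees the fold the sub-models were trained on, which keeps its input probabilities honest estimates rather than overfit training-set scores.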
Pages: 42 - 51
Number of pages: 9
Related papers
50 records in total
  • [1] Visual-audio emotion recognition based on multi-task and ensemble learning with multiple features
    Hao, Man
    Cao, Wei-Hua
    Liu, Zhen-Tao
    Wu, Min
    Xiao, Peng
    NEUROCOMPUTING, 2020, 391 : 42 - 51
  • [2] Speech Emotion Recognition based on Multi-Task Learning
    Zhao, Huijuan
    Han, Zhijie
    Wang, Ruchuan
    2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 186 - 188
  • [3] Audio-Visual Group-based Emotion Recognition using Local and Global Feature Aggregation based Multi-Task Learning
    Li, Sunan
    Lian, Hailun
    Lu, Cheng
    Zhao, Yan
    Tang, Chuangao
    Zong, Yuan
    Zheng, Wenming
    PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2023, 2023, : 741 - 745
  • [4] Multi-Task Ensemble Learning for Affect Recognition
    Gjoreski, Martin
    Lustrek, Mitja
    Gams, Matjaz
    PROCEEDINGS OF THE 2018 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING AND PROCEEDINGS OF THE 2018 ACM INTERNATIONAL SYMPOSIUM ON WEARABLE COMPUTERS (UBICOMP/ISWC'18 ADJUNCT), 2018, : 553 - 558
  • [5] Speech Emotion Recognition with Multi-task Learning
    Cai, Xingyu
    Yuan, Jiahong
    Zheng, Renjie
    Huang, Liang
    Church, Kenneth
    INTERSPEECH 2021, 2021, : 4508 - 4512
  • [6] Emotion recognition in conversations with emotion shift detection based on multi-task learning
    Gao, Qingqing
    Cao, Biwei
    Guan, Xin
    Gu, Tianyun
    Bao, Xing
    Wu, Junyan
    Liu, Bo
    Cao, Jiuxin
    KNOWLEDGE-BASED SYSTEMS, 2022, 248
  • [7] Multi-task Learning for Speech Emotion and Emotion Intensity Recognition
    Yue, Pengcheng
    Qu, Leyuan
    Zheng, Shukai
    Li, Taihao
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1232 - 1237
  • [8] A Multi-Scale Multi-Task Learning Model for Continuous Dimensional Emotion Recognition from Audio
    Li, Xia
    Lu, Guanming
    Yan, Jingjie
    Zhang, Zhengyan
    ELECTRONICS, 2022, 11 (03)
  • [9] Inconsistency-Based Multi-Task Cooperative Learning for Emotion Recognition
    Xu, Yifan
    Cui, Yuqi
    Jiang, Xue
    Yin, Yingjie
    Ding, Jingting
    Li, Liang
    Wu, Dongrui
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (04) : 2017 - 2027
  • [10] Meta Multi-task Learning for Speech Emotion Recognition
    Cai, Ruichu
    Guo, Kaibin
    Xu, Boyan
    Yang, Xiaoyan
    Zhang, Zhenjie
    INTERSPEECH 2020, 2020, : 3336 - 3340