A Multi-Scale Multi-Task Learning Model for Continuous Dimensional Emotion Recognition from Audio

Cited by: 4
|
Authors
Li, Xia [1 ,2 ]
Lu, Guanming [1 ]
Yan, Jingjie [1 ]
Zhang, Zhengyan [1 ,3 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Telecommun & Informat Engn, Nanjing 210003, Peoples R China
[2] Anhui Univ Technol, Sch Math & Phys, Maanshan 243000, Peoples R China
[3] Jiangsu Univ Sci & Technol, Sch Elect & Informat, Zhenjiang 212003, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
continuous dimensional emotion recognition; multi-task learning; deep belief network;
DOI
10.3390/electronics11030417
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Owing to the many advantages of the dimensional emotion model, continuous dimensional emotion recognition from audio has attracted increasing attention in recent years. Features and dimensional emotion labels on different time scales have different characteristics and carry different information. To fully exploit features and emotion representations from multiple time scales, a novel multi-scale multi-task (MSMT) learning model is proposed in this paper. The MSMT model is built on a deep belief network (DBN) with only one hidden layer. All features share the same hidden-layer and linear-layer parameters, and multiple temporal pooling operations are inserted between the hidden layer and the linear layer to extract information at multiple time scales. The mean squared errors (MSE) of the main task and the secondary task are combined to form the final objective function. Extensive experiments were conducted on the RECOLA and SEMAINE datasets to demonstrate the effectiveness of the model. The results on both datasets show that even adding a secondary scale to the scale with the best single-scale, single-task performance yields significant performance improvements.
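The abstract's architecture can be sketched in a few lines: a shared sigmoid hidden layer and a shared linear output layer, with different temporal pooling windows between them producing predictions at two time scales, whose MSE losses are then combined. The sketch below is a minimal illustration under assumed dimensions and an assumed pooling scheme (non-overlapping mean pooling) and task weight `alpha`; none of these specifics come from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 12 frames, 8-dim features, 4 hidden units.
T, D, H = 12, 8, 4
W_h = rng.normal(size=(D, H))  # shared hidden-layer weights (DBN with one hidden layer)
b_h = np.zeros(H)
w_o = rng.normal(size=H)       # shared linear output layer (one emotion dimension)
b_o = 0.0

def hidden(x):
    """Shared sigmoid hidden layer applied frame by frame."""
    return 1.0 / (1.0 + np.exp(-(x @ W_h + b_h)))

def temporal_pool(h, win):
    """Mean-pool hidden activations over non-overlapping windows of `win` frames."""
    n = h.shape[0] // win
    return h[: n * win].reshape(n, win, -1).mean(axis=1)

x = rng.normal(size=(T, D))  # audio feature sequence
h = hidden(x)

# The two time scales share the hidden and linear parameters;
# only the pooling window between the two layers differs.
pred_main = temporal_pool(h, 2) @ w_o + b_o  # fine scale (main task): 6 predictions
pred_sec = temporal_pool(h, 4) @ w_o + b_o   # coarse scale (secondary task): 3 predictions

y_main = rng.normal(size=pred_main.shape)    # dummy labels at each scale
y_sec = rng.normal(size=pred_sec.shape)

mse = lambda p, y: float(np.mean((p - y) ** 2))
alpha = 0.5                                  # hypothetical secondary-task weight
loss = mse(pred_main, y_main) + alpha * mse(pred_sec, y_sec)
```

With 12 input frames, pooling windows of 2 and 4 frames yield 6 fine-scale and 3 coarse-scale predictions from a single forward pass through the shared layers, which is the parameter-sharing idea the abstract describes.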
Pages: 16
Related Papers
50 records
  • [1] Speech Emotion Recognition with Multi-task Learning
    Cai, Xingyu
    Yuan, Jiahong
    Zheng, Renjie
    Huang, Liang
    Church, Kenneth
    INTERSPEECH 2021, 2021, : 4508 - 4512
  • [2] Multi-Task Emotion Recognition Based on Dimensional Model and Category Label
    Huo, Yi
    Ge, Yun
    IEEE ACCESS, 2024, 12 : 75169 - 75179
  • [3] Multi-task Learning for Speech Emotion and Emotion Intensity Recognition
    Yue, Pengcheng
    Qu, Leyuan
    Zheng, Shukai
    Li, Taihao
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1232 - 1237
  • [4] Meta Multi-task Learning for Speech Emotion Recognition
    Cai, Ruichu
    Guo, Kaibin
    Xu, Boyan
    Yang, Xiaoyan
    Zhang, Zhenjie
    INTERSPEECH 2020, 2020, : 3336 - 3340
  • [5] Emotion Recognition With Sequential Multi-task Learning Technique
    Phan Tran Dac Thinh
    Hoang Manh Hung
    Yang, Hyung-Jeong
    Kim, Soo-Hyung
    Lee, Guee-Sang
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 3586 - 3589
  • [6] Speech Emotion Recognition based on Multi-Task Learning
    Zhao, Huijuan
    Han Zhijie
    Wang, Ruchuan
    2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 186 - 188
  • [7] EmoComicNet: A multi-task model for comic emotion recognition
    Dutta, Arpita
    Biswas, Samit
    Das, Amit Kumar
    PATTERN RECOGNITION, 2024, 150
  • [8] Multi-Task Learning Model Based on Multi-Scale CNN and LSTM for Sentiment Classification
    Jin, Ning
    Wu, Jiaxian
    Ma, Xiang
    Yan, Ke
    Mo, Yuchang
    IEEE ACCESS, 2020, 8 : 77060 - 77072
  • [9] Visual-audio emotion recognition based on multi-task and ensemble learning with multiple features
    Hao M.
    Cao W.-H.
    Liu Z.-T.
    Wu M.
    Xiao P.
    NEUROCOMPUTING, 2020, 391 : 42 - 51
  • [10] MATTE: Multi-task multi-scale attention
    Strezoski, Gjorgji
    van Noord, Nanne
    Worring, Marcel
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 228