Design of Efficient Speech Emotion Recognition Based on Multi Task Learning

Cited by: 8
Authors
Liu, Yunxiang [1]
Zhang, Kexin [1]
Affiliations
[1] Shanghai Inst Technol, Dept Comp Sci, Shanghai 201418, Peoples R China
Keywords
Task analysis; Multitasking; Emotion recognition; Feature extraction; Noise measurement; Speech recognition; Decoding; Speech emotion recognition; multi-task learning; noise reduction; eliminating gender differences; hidden layer sharing; data balance; specific task classification processing; CLASSIFICATION; FEATURES; CORPUS;
DOI
10.1109/ACCESS.2023.3237268
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Discipline Classification Code
0812
Abstract
Speech emotion recognition technology includes feature extraction and classifier construction. However, recognition performance is reduced by noise interference and gender differences. To address this problem, this paper used two multi-task learning models based on adversarial multi-task learning (ASP-MTL). The first model took emotion recognition as the main task and noise recognition as the auxiliary task, and removed the noise segments identified by the auxiliary task. The second model was then constructed on the remaining non-noise segments, taking emotion recognition as the main task and gender classification as the auxiliary task. These two multi-task learning models can not only use shared information to learn the relationships between different tasks, but also perform task-specific classification. This paper used the Audio/Visual Emotion Challenge (AVEC) database and the AFEW6.0 database, both of which were recorded in real-world (in-the-wild) environments. Because the datasets are imbalanced, a data-balancing operation was carried out during data preprocessing. Compared with recent works on the AVEC and AFEW6.0 datasets, the proposed approach improves accuracy and F1 score by around 10%, which demonstrates substantial progress in SER.
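The abstract describes an ASP-MTL-style design: a shared encoder learned adversarially across tasks, private (task-specific) encoders, a main emotion-recognition head, and an auxiliary head (noise recognition in the first model, gender classification in the second). The following PyTorch sketch is a minimal illustration of that shared/private split with a gradient-reversal task discriminator; the layer types, sizes, and feature dimension are assumptions for illustration only, not the authors' implementation.

# Hypothetical sketch of one ASP-MTL-style model from the abstract (e.g. the
# second model: emotion recognition as main task, gender as auxiliary task).
# Layer sizes, the BiLSTM encoders, and the 88-dim feature input are assumptions.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Gradient reversal used by the adversarial task discriminator."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient flowing into the shared encoder.
        return -ctx.lambd * grad_output, None


class ASPMTLSketch(nn.Module):
    def __init__(self, feat_dim=88, hidden=128, n_emotions=4, n_aux=2):
        super().__init__()
        # Shared encoder: representation common to both tasks.
        self.shared = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        # Private encoders: task-specific representations.
        self.private_emo = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.private_aux = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        # Task heads consume [shared; private] features.
        self.emo_head = nn.Linear(4 * hidden, n_emotions)
        self.aux_head = nn.Linear(4 * hidden, n_aux)
        # Discriminator tries to guess which task produced the shared features;
        # gradient reversal pushes the shared space to stay task-invariant.
        self.discriminator = nn.Linear(2 * hidden, 2)

    @staticmethod
    def _pool(seq_out):
        return seq_out.mean(dim=1)  # average over time frames

    def forward(self, x, lambd=1.0):
        s = self._pool(self.shared(x)[0])
        pe = self._pool(self.private_emo(x)[0])
        pa = self._pool(self.private_aux(x)[0])
        emotion_logits = self.emo_head(torch.cat([s, pe], dim=-1))
        aux_logits = self.aux_head(torch.cat([s, pa], dim=-1))
        task_logits = self.discriminator(GradReverse.apply(s, lambd))
        return emotion_logits, aux_logits, task_logits


# Toy usage: a batch of 8 utterances, 100 frames of 88-dim acoustic features.
model = ASPMTLSketch()
emo, aux, task = model(torch.randn(8, 100, 88))

In training, the emotion and auxiliary heads would each receive a cross-entropy loss, while the discriminator loss (through the reversed gradient) discourages the shared encoder from carrying task-identifying information, which is the core idea of adversarial shared-private multi-task learning.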
Pages: 5528-5537
Number of pages: 10
Related Papers
50 in total
  • [1] Speech Emotion Recognition based on Multi-Task Learning
    Zhao, Huijuan
    Han, Zhijie
    Wang, Ruchuan
    2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 186 - 188
  • [2] Speech Emotion Recognition with Multi-task Learning
    Cai, Xingyu
    Yuan, Jiahong
    Zheng, Renjie
    Huang, Liang
    Church, Kenneth
    INTERSPEECH 2021, 2021, : 4508 - 4512
  • [3] Multi-task Learning for Speech Emotion and Emotion Intensity Recognition
    Yue, Pengcheng
    Qu, Leyuan
    Zheng, Shukai
    Li, Taihao
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1232 - 1237
  • [4] Meta Multi-task Learning for Speech Emotion Recognition
    Cai, Ruichu
    Guo, Kaibin
    Xu, Boyan
    Yang, Xiaoyan
    Zhang, Zhenjie
    INTERSPEECH 2020, 2020, : 3336 - 3340
  • [5] Coarse-to-Fine Speech Emotion Recognition Based on Multi-Task Learning
    Zhao, Huijuan
    Ye, Ning
    Wang, Ruchuan
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2021, 93 (2-3): 299 - 308
  • [6] MMER: Multimodal Multi-task Learning for Speech Emotion Recognition
    Ghosh, Sreyan
    Tyagi, Utkarsh
    Ramaneswaran, S.
    Srivastava, Harshvardhan
    Manocha, Dinesh
    INTERSPEECH 2023, 2023, : 1209 - 1213
  • [7] Speech Emotion Recognition using Decomposed Speech via Multi-task Learning
    Hsu, Jia-Hao
    Wu, Chung-Hsien
    Wei, Yu-Hung
    INTERSPEECH 2023, 2023, : 4553 - 4557
  • [8] Speech Emotion Recognition Based on Multi-Task Learning Using a Convolutional Neural Network
    Kim, Nam Kyun
    Lee, Jiwon
    Ha, Hun Kyu
    Lee, Geon Woo
    Lee, Jung Hyuk
    Kim, Hong Kook
    2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 704 - 707
  • [9] Speech Emotion Recognition in the Wild using Multi-task and Adversarial Learning
    Parry, Jack
    DeMattos, Eric
    Klementiev, Anita
    Ind, Axel
    Morse-Kopp, Daniela
    Clarke, Georgia
    Palaz, Dimitri
    INTERSPEECH 2022, 2022, : 1158 - 1162