Enhancing Emotion Recognition in Conversation Through Emotional Cross-Modal Fusion and Inter-class Contrastive Learning

Citations: 0
|
Authors
Shi, Haoxiang [1 ,2 ]
Zhang, Xulong [1 ]
Cheng, Ning [1 ]
Zhang, Yong [1 ]
Yu, Jun [2 ]
Xiao, Jing [1 ]
Wang, Jianzong [1 ]
Affiliations
[1] Ping An Technol Shenzhen Co Ltd, Shenzhen, Peoples R China
[2] Univ Sci & Technol China, Hefei, Peoples R China
Source
ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024 | 2024 / Vol. 14877
Keywords
Emotion recognition; Multi-modal fusion; Contrastive learning
DOI
10.1007/978-981-97-5669-8_32
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The purpose of emotion recognition in conversation (ERC) is to identify the emotion category of an utterance based on contextual information. Previous ERC methods relied on simple concatenation for cross-modal fusion and ignored the information differences between modalities, so the model could not focus on modality-specific emotional information; at the same time, the information shared between modalities was left unprocessed, producing an information-redundancy problem in emotion prediction. To overcome these limitations, we propose a cross-modal fusion emotion prediction network based on vector connections. The network comprises two stages: a multi-modal feature fusion stage based on connection vectors and an emotion classification stage based on the fused features. Furthermore, we design a supervised inter-class contrastive learning module based on emotion labels. Experimental results confirm the effectiveness of the proposed method, demonstrating excellent performance on the IEMOCAP and MELD datasets.
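The abstract's supervised inter-class contrastive module can be illustrated with a minimal sketch of a label-supervised contrastive loss (in the style of Khosla et al.'s SupCon): same-emotion utterance embeddings are pulled together and different-emotion ones pushed apart. The function name, the temperature `tau`, and the assumption of L2-normalized embeddings are illustrative choices, not details from the paper.

```python
import math

def sup_con_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss over a batch of L2-normalized embeddings.

    For each anchor i, every other sample with the same label is a positive;
    the loss is the mean negative log-probability of picking a positive
    under a temperature-scaled softmax over all other samples.
    """
    n = len(embeddings)
    total, count = 0.0, 0
    for i in range(n):
        # temperature-scaled dot-product similarity to every sample
        sims = [sum(a * b for a, b in zip(embeddings[i], embeddings[j])) / tau
                for j in range(n)]
        # softmax denominator excludes the anchor itself
        denom = sum(math.exp(sims[j]) for j in range(n) if j != i)
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # anchors without positives contribute nothing
        loss_i = -sum(math.log(math.exp(sims[p]) / denom) for p in positives)
        total += loss_i / len(positives)
        count += 1
    return total / max(count, 1)
```

On a toy batch, embeddings clustered by emotion label yield a much smaller loss than embeddings where the two classes are interleaved, which is the inter-class separation effect the abstract describes.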
Pages: 391 - 401 (11 pages)
Related Papers (50 in total)
  • [1] Cross-modal contrastive learning for multimodal sentiment recognition
    Yang, Shanliang
    Cui, Lichao
    Wang, Lei
    Wang, Tao
    APPLIED INTELLIGENCE, 2024, 54 (05) : 4260 - 4276
  • [3] A Cross-Modal Correlation Fusion Network for Emotion Recognition in Conversations
    Tang, Xiaolyu
    Cai, Guoyong
    Chen, Ming
    Yuan, Peicong
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT V, NLPCC 2024, 2025, 15363 : 55 - 68
  • [4] A cross-modal fusion network based on graph feature learning for multimodal emotion recognition
    Cao, Xiaopeng
    Zhang, Linying
    Chen, Qiuxian
    Ning, Hailong
    Dong, Yizhuo
    THE JOURNAL OF CHINA UNIVERSITIES OF POSTS AND TELECOMMUNICATIONS, 2024, 31 (06) : 16 - 25
  • [5] Cross-Modal Dynamic Transfer Learning for Multimodal Emotion Recognition
    Hong, Soyeon
    Kang, Hyeoungguk
    Cho, Hyunsouk
    IEEE ACCESS, 2024, 12 : 14324 - 14333
  • [6] Enhancing Cross-Modal Understanding for Audio Visual Scene-Aware Dialog Through Contrastive Learning
    Xu, Feifei
    Zhou, Wang
    Li, Guangzhen
    Zhong, Zheng
    Zhou, Yingchen
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024,
  • [7] Elder emotion classification through multimodal fusion of intermediate layers and cross-modal transfer learning
    Sreevidya, P.
    Veni, S.
    Murthy, O. V. Ramana
    SIGNAL IMAGE AND VIDEO PROCESSING, 2022, 16 (05) : 1281 - 1288
  • [9] Gated Multi-modal Fusion with Cross-modal Contrastive Learning for Video Question Answering
    Lyu, Chenyang
    Li, Wenxi
    Ji, Tianbo
    Zhou, Liting
    Gurrin, Cathal
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260 : 427 - 438
  • [10] SELF-SUPERVISED LEARNING WITH CROSS-MODAL TRANSFORMERS FOR EMOTION RECOGNITION
    Khare, Aparna
    Parthasarathy, Srinivas
    Sundaram, Shiva
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 381 - 388