Topics Guided Multimodal Fusion Network for Conversational Emotion Recognition

Cited by: 0
Authors
Yuan, Peicong [1 ]
Cai, Guoyong [1 ]
Chen, Ming [1 ]
Tang, Xiaolv [1 ]
Affiliations
[1] Guilin Univ Elect Technol, Guilin, Peoples R China
Keywords
Emotion Recognition in Conversation; Neural Topic Model; Multimodal Fusion;
DOI
10.1007/978-981-97-5669-8_21
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Emotion Recognition in Conversation (ERC) is a challenging task. Previous methods capture the semantic dependencies between utterances through complex conversational context modeling but ignore the topic information contained in the utterances; moreover, the commonality of multimodal information has not been effectively explored. To this end, the Topics Guided Multimodal Fusion Network (TGMFN) is proposed to extract effective utterance topic information and to exploit cross-modal commonality and complementarity for improved performance. First, a VAE-based neural topic model is used to build a conversational topic model, with a new topic sampling strategy that differs from the traditional reparameterization trick and makes topic modeling more suitable for utterances. Second, a facial feature extraction method for multi-party conversations is proposed to extract rich facial features from video. Finally, the Topic-Guided Vision-Audio features Aware fusion (TGV2A) module is designed around the conversation topic; it fuses modality information such as the speaker's facial features and topic-related co-occurrence information, capturing the commonality and complementarity among modalities to enrich feature semantics. Extensive experiments on two multimodal ERC datasets, IEMOCAP and MELD, show that the proposed TGMFN outperforms leading baseline methods.
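To make the two core ideas in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' code) of a VAE-based neural topic model over utterance bag-of-words vectors and a topic-guided cross-modal attention block. All class names, dimensions, and the use of the standard reparameterization trick here are assumptions; the paper proposes its own topic sampling strategy and a different TGV2A fusion design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class UtteranceNTM(nn.Module):
    """Sketch: VAE-style neural topic model over utterance bag-of-words vectors."""

    def __init__(self, vocab_size: int, num_topics: int = 50, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, num_topics)
        self.logvar = nn.Linear(hidden, num_topics)
        self.decoder = nn.Linear(num_topics, vocab_size)  # topic-word weights

    def forward(self, bow: torch.Tensor):
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        # Standard reparameterization sampling shown for illustration only;
        # the paper replaces this step with its own topic sampling strategy.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        theta = F.softmax(z, dim=-1)                      # utterance topic proportions
        recon = F.log_softmax(self.decoder(theta), dim=-1)
        rec_loss = -(bow * recon).sum(-1).mean()          # reconstruction term
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return theta, rec_loss + kl                       # negative ELBO


class TopicGuidedFusion(nn.Module):
    """Sketch: topic vectors condition cross-modal attention from text onto audio/vision."""

    def __init__(self, d_model: int = 256, num_topics: int = 50, heads: int = 4):
        super().__init__()
        self.topic_proj = nn.Linear(num_topics, d_model)
        self.attn = nn.MultiheadAttention(d_model, heads, batch_first=True)

    def forward(self, text, audio, vision, theta):
        # Bias every modality with the utterance topic, then let topic-conditioned
        # text queries attend over the concatenated audio/visual context.
        topic = self.topic_proj(theta).unsqueeze(1)       # (B, 1, d_model)
        query = text + topic
        context = torch.cat([audio, vision], dim=1) + topic
        fused, _ = self.attn(query, context, context)
        return fused
```

In this sketch, the topic proportions theta produced by UtteranceNTM would be passed to TopicGuidedFusion together with per-utterance text, audio, and visual feature sequences that share the same model dimension; how TGMFN actually couples these components is described in the paper itself.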
Pages: 250-262
Page count: 13