Topics Guided Multimodal Fusion Network for Conversational Emotion Recognition

Cited: 0
Authors
Yuan, Peicong [1 ]
Cai, Guoyong [1 ]
Chen, Ming [1 ]
Tang, Xiaolv [1 ]
Affiliations
[1] Guilin University of Electronic Technology, Guilin, People's Republic of China
Keywords
Emotion Recognition in Conversation; Neural Topic Model; Multimodal Fusion
DOI
10.1007/978-981-97-5669-8_21
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Emotion Recognition in Conversation (ERC) is a highly challenging task. Previous methods capture semantic dependencies between utterances through complex conversational context modeling but ignore the topic information carried by the utterances; moreover, the commonality of multimodal information has not been effectively explored. To this end, the Topics Guided Multimodal Fusion Network (TGMFN) is proposed to extract effective utterance topic information and to exploit cross-modal commonality and complementarity. First, a VAE-based neural topic model is used to build a conversational topic model, with a new topic sampling strategy that differs from the traditional reparameterization trick, making the topic modeling better suited to utterances. Second, a facial feature extraction method for multi-party conversations is proposed to extract rich facial features from video. Finally, the Topic-Guided Vision-Audio features Aware fusion (TGV2A) module is designed around the conversation topic; it fuses modality information such as the speaker's facial features and topic-related co-occurrence information, and captures the commonality and complementarity among modalities to enrich feature semantics. Extensive experiments on two multimodal ERC datasets, IEMOCAP and MELD, show that TGMFN outperforms leading baseline methods.
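To make the first step concrete, below is a minimal PyTorch sketch of a bag-of-words topic VAE over utterances. The class name, layer sizes, and loss terms are illustrative assumptions, and it uses the standard reparameterization trick as a stand-in; the paper's own utterance-oriented sampling strategy, which replaces that trick, is not reproduced here.

```python
# Minimal sketch of a VAE-based neural topic model for utterances.
# Assumption: each utterance is a bag-of-words count vector; this is
# not TGMFN's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UtteranceTopicVAE(nn.Module):
    def __init__(self, vocab_size: int, num_topics: int, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, num_topics)
        self.logvar = nn.Linear(hidden, num_topics)
        self.decoder = nn.Linear(num_topics, vocab_size)  # topic-word weights

    def forward(self, bow):
        # bow: (batch, vocab_size) bag-of-words counts per utterance
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        # Standard reparameterization trick; TGMFN substitutes its own
        # topic sampling strategy here.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        theta = F.softmax(z, dim=-1)          # utterance-topic distribution
        logits = self.decoder(theta)          # reconstructed word logits
        recon = -(bow * F.log_softmax(logits, dim=-1)).sum(-1).mean()
        kld = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return theta, recon + kld

# Usage: theta would then serve as the topic signal guiding fusion
# (e.g., a TGV2A-style module).
model = UtteranceTopicVAE(vocab_size=2000, num_topics=20)
theta, loss = model(torch.rand(8, 2000))  # 8 utterances as BoW vectors
loss.backward()
```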
Pages: 250-262 (13 pages)
相关论文
共 50 条
  • [41] Audio-Guided Fusion Techniques for Multimodal Emotion Analysis
    Shi, Pujin
    Gao, Fei
    PROCEEDINGS OF THE 2ND INTERNATIONAL WORKSHOP ON MULTIMODAL AND RESPONSIBLE AFFECTIVE COMPUTING, MRAC 2024, 2024, : 62 - 66
  • [42] Decision-Level Fusion Method for Emotion Recognition using Multimodal Emotion Recognition Information
    Song, Kyu-Seob
    Nho, Young-Hoon
    Seo, Ju-Hwan
    Kwon, Dong-Soo
    2018 15TH INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS (UR), 2018, : 472 - 476
  • [43] MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals
    Zhu, Lei
    Ding, Yu
    Huang, Aiai
    Tan, Xufei
    Zhang, Jianhai
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (01)
  • [44] A cross-modal fusion network based on graph feature learning for multimodal emotion recognition
    Cao Xiaopeng
    Zhang Linying
    Chen Qiuxian
    Ning Hailong
    Dong Yizhuo
    The Journal of China Universities of Posts and Telecommunications, 2024, 31 (06) : 16 - 25
  • [45] TDFNet: Transformer-Based Deep-Scale Fusion Network for Multimodal Emotion Recognition
    Zhao, Zhengdao
    Wang, Yuhua
    Shen, Guang
    Xu, Yuezhu
    Zhang, Jiayuan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 3771 - 3782
  • [46] A novel feature fusion network for multimodal emotion recognition from EEG and eye movement signals
    Fu, Baole
    Gu, Chunrui
    Fu, Ming
    Xia, Yuxiao
    Liu, Yinhua
    FRONTIERS IN NEUROSCIENCE, 2023, 17
  • [47] SDR-GNN: Spectral Domain Reconstruction Graph Neural Network for incomplete multimodal learning in conversational emotion recognition
    Fu, Fangze
    Ai, Wei
    Yang, Fan
    Shou, Yuntao
    Meng, Tao
    Li, Keqin
    KNOWLEDGE-BASED SYSTEMS, 2025, 309
  • [48] Multimodal Emotion Recognition based on Global Information Fusion in Conversations
    Kim, Dae Hyeon
    Choi, Young-Seok
    2024 INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS, AND COMMUNICATIONS, ITC-CSCC 2024, 2024,
  • [49] Research on Multimodal Emotion Recognition Based on Fusion of Electroencephalogram and Electrooculography
    Yin, Jialai
    Wu, Minchao
    Yang, Yan
    Li, Ping
    Li, Fan
    Liang, Wen
    Lv, Zhao
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 12
  • [50] Multimodal Local-Global Ranking Fusion for Emotion Recognition
    Liang, Paul Pu
    Zadeh, Amir
    Morency, Louis-Philippe
    ICMI'18: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2018, : 472 - 476