PGIF: A Personality-Guided Iterative Feedback Graph Network for Multimodal Conversational Emotion Recognition

Citations: 0
Authors
Xie, Yunhe [1 ]
Mao, Rui [2 ]
Affiliations
[1] Harbin Inst Technol, Fac Comp, Harbin 150001, Peoples R China
[2] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore 639798, Singapore
Keywords
Emotion recognition; Iterative methods; Pragmatics; Feature extraction; Vectors; Semantics; Oral communication; Long short-term memory; Correlation; Context modeling; interlocutor-induced pragmatic variation; iterative feedback fusion mechanism; multimodal conversational emotion recognition (MCER)
DOI
10.1109/TCSS.2024.3523322
CLC Number
TP3 [Computing technology; computer technology]
Discipline Code
0812
Abstract
Multimodal emotion recognition in conversation (MERC) aims to identify the emotion of a target utterance from multimodal records, and has drawn significant attention for its value in conversational artificial intelligence. While early research focused on exploring conversational context, recent efforts emphasize integrating multimodal cues. Existing methods model the impact of conversational context on emotion recognition but neglect the speaker's personality factors. Moreover, they often suffer from inefficient information transfer due to full-utterance connectivity and fail to exploit the complementary benefits of multiple fusion modes. To address these issues, we propose a personality-guided iterative feedback graph network (PGIF) for MERC. PGIF incorporates personality information as a specialized modality to enrich the feature space for emotional inference. We use a graph network to model information flow, integrating interlocutor-aware contextual information by considering interlocutor dependencies between utterances, and we employ a dialogue discourse parser to directly model semantic relationships between utterances. Our iterative feedback fusion mechanism explicitly simulates the emotional interaction between feature-level and decision-level fusion, refining predictions iteratively without requiring ground-truth labels. PGIF improves over state-of-the-art methods by 1.94% and 1.42% on the IEMOCAP and MELD datasets, respectively. Ablation studies further validate the effectiveness of PGIF's mechanisms, and because PGIF operates on input features and global fusion strategies, it remains compatible with existing approaches.
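The iterative feedback fusion described in the abstract can be pictured as a loop in which the decision-level prediction is fed back to refine the feature-level representation. The following is a minimal PyTorch sketch of that idea only; the class name, layer choices, dimensions, and iteration count are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class IterativeFeedbackFusion(nn.Module):
    """Hypothetical sketch: decision-level logits are fed back to refine
    feature-level fusion over a few iterations, with no labels required."""

    def __init__(self, dim: int, num_classes: int, num_iters: int = 3):
        super().__init__()
        self.num_iters = num_iters
        # Feature-level fusion: concatenate text/audio/visual features and project to a joint space.
        self.feature_fuse = nn.Linear(3 * dim, dim)
        # Decision-level head producing emotion logits.
        self.classifier = nn.Linear(dim, num_classes)
        # Feedback path: map the soft decision back into the feature space.
        self.feedback = nn.Linear(num_classes, dim)

    def forward(self, text, audio, visual):
        # Inputs: (batch, dim) utterance-level features from upstream encoders.
        fused = torch.tanh(self.feature_fuse(torch.cat([text, audio, visual], dim=-1)))
        logits = self.classifier(fused)
        for _ in range(self.num_iters):
            # Feed the current soft decision back to adjust the fused features;
            # refinement is self-contained and needs no ground-truth labels.
            fused = torch.tanh(fused + self.feedback(logits.softmax(dim=-1)))
            logits = self.classifier(fused)
        return logits

# Example with 6 emotion classes (as in IEMOCAP) and 256-d utterance features.
model = IterativeFeedbackFusion(dim=256, num_classes=6)
out = model(torch.randn(4, 256), torch.randn(4, 256), torch.randn(4, 256))
print(out.shape)  # torch.Size([4, 6])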
Pages: 13