Multimodal dual perception fusion framework for multimodal affective analysis

Cited by: 0
Authors
Lu, Qiang [1 ]
Sun, Xia [1 ]
Long, Yunfei [2 ]
Zhao, Xiaodi [1 ]
Zou, Wang [1 ]
Feng, Jun [1 ]
Wang, Xuxin [1 ]
Affiliations
[1] Northwest Univ, Sch Informat Sci & Technol, Xian 710127, Peoples R China
[2] Univ Essex, Sch Comp Sci & Elect Engn, Colchester CO4 3SQ, England
Keywords
Multimodal sentiment analysis; Sarcasm detection; Fake news detection; Multimodal affective analysis; Multimodal dual perception fusion;
DOI
10.1016/j.inffus.2024.102747
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The misuse of social platforms and the difficulty of moderating posted content have led to a surge of negative sentiment, sarcasm, and rampant fake news. In response, image-text multimodal sentiment analysis, sarcasm detection, and fake news detection have recently attracted considerable attention. Because these tasks share semantic and sentiment features and face related fusion challenges in deciphering complex human expressions across modalities, integrating these multimodal classification tasks into a unified framework promises to simplify research in sentiment analysis and to improve classification tasks that involve both semantic and sentiment modeling. We therefore treat them as integral components of a broader line of research, multimodal affective analysis over semantics and sentiment, and propose a novel multimodal dual perception fusion framework (MDPF). Specifically, MDPF comprises three core procedures: (1) generating bootstrapping language-image knowledge to enrich the original modality space, and applying cross-modal contrastive learning to align the text and image modalities and capture their underlying semantics and interactions; (2) designing a dynamic connective mechanism to adaptively match image-text pairs, jointly employing a Gaussian-weighted distribution to intensify semantic sequences; (3) constructing a cross-modal graph that preserves the structured information of both image and text and shares information between modalities, while introducing sentiment knowledge to refine the graph's edge weights and capture cross-modal sentiment interaction. We evaluate MDPF on three publicly available datasets across three tasks, and the empirical results demonstrate the superiority of the proposed model.
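The abstract's three procedures map onto familiar building blocks. As a rough illustration only, the sketch below shows (1) a symmetric contrastive loss for text-image alignment, (2) Gaussian-weighted emphasis over a token sequence, and (3) sentiment-refined edge weights for a cross-modal graph. All function names, tensor shapes, and the sentiment-weighting formula are assumptions inferred from the abstract alone, not the authors' implementation.

```python
# Hypothetical sketch of MDPF-style components; names, shapes, and the
# weighting formulas are assumptions, not the paper's released code.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(text_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning matched text/image pairs (procedure 1)."""
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.t() / temperature   # (B, B) similarities
    targets = torch.arange(text_emb.size(0))          # diagonal = matched pairs
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def gaussian_weighted_sequence(seq, center, sigma=2.0):
    """Emphasize tokens near a salient position with a Gaussian profile (procedure 2)."""
    positions = torch.arange(seq.size(1), dtype=torch.float)
    weights = torch.exp(-0.5 * ((positions - center) / sigma) ** 2)
    return seq * weights.unsqueeze(0).unsqueeze(-1)   # scale a (B, L, D) sequence

def sentiment_refined_edges(node_feats, sentiment_scores):
    """Cosine-similarity adjacency rescaled by pairwise sentiment agreement (procedure 3)."""
    normed = F.normalize(node_feats, dim=-1)
    sim = normed @ normed.t()                          # (N, N) semantic similarity
    polarity = sentiment_scores.unsqueeze(1) * sentiment_scores.unsqueeze(0)
    return torch.sigmoid(sim * (1.0 + polarity))       # edge weights in (0, 1)

if __name__ == "__main__":
    B, L, D = 4, 16, 256
    t, v = torch.randn(B, D), torch.randn(B, D)
    print(contrastive_alignment_loss(t, v).item())
    seq = gaussian_weighted_sequence(torch.randn(B, L, D), center=7.0)
    nodes = torch.randn(2 * B, D)                      # text nodes + image nodes
    adj = sentiment_refined_edges(nodes, torch.rand(2 * B) * 2 - 1)
    print(seq.shape, adj.shape)
```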
Pages: 12
Related papers
50 records in total
  • [41] MAG: a smart gloves system based on multimodal fusion perception
    Cui, Hong
    Feng, Zhiquan
    Tian, Jinglan
    Kong, Dehui
    Xia, Zishuo
    Li, Weina
    CCF TRANSACTIONS ON PERVASIVE COMPUTING AND INTERACTION, 2023, 5 (04) : 411 - 429
  • [42] Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis
    Han, Wei
    Chen, Hui
    Poria, Soujanya
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9180 - 9192
  • [43] On a Combined Analysis Framework for Multimodal Discourse Analysis
Dou, Ruifang
CAMPUS ENGLISH (校园英语), 2015, (26): 208 - 209
  • [44] Triple disentangled representation learning for multimodal affective analysis
    Zhou, Ying
    Liang, Xuefeng
    Chen, Han
    Zhao, Yin
    Chen, Xin
    Yu, Lida
    INFORMATION FUSION, 2024, 114
  • [45] Disentangled variational auto-encoder for multimodal fusion performance analysis in multimodal sentiment analysis
    Chen, Rongfei
    Zhou, Wenju
    Hu, Huosheng
    Fei, Zixiang
    Fei, Minrui
    Zhou, Hao
    KNOWLEDGE-BASED SYSTEMS, 2024, 301
  • [46] A Multimodal Perception and Cognition Framework and Its Application for Social Robots
    Dong, Lanfang
    Hu, PuZhao
    Xiao, Xiao
    Tang, YingChao
    Mao, Meng
    Li, Guoming
    SOCIAL ROBOTICS, ICSR 2022, PT I, 2022, 13817 : 475 - 484
  • [47] Dual-Perspective Fusion Network for Aspect-Based Multimodal Sentiment Analysis
    Wang, Di
    Tian, Changning
    Liang, Xiao
    Zhao, Lin
    He, Lihuo
    Wang, Quan
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26: 4028 - 4038
  • [48] DSAF: A Dual-Stage Attention Based Multimodal Fusion Framework for Medical Visual Question Answering
    K. Mukesh
    S. L. Jayaprakash
    R. Prasanna Kumar
SN COMPUTER SCIENCE, 6 (4)
  • [49] Tensor Analysis and Fusion of Multimodal Brain Images
    Karahan, Esin
    Rojas-Lopez, Pedro A.
    Bringas-Vega, Maria L.
    Valdes-Hernandez, Pedro A.
    Valdes-Sosa, Pedro A.
    PROCEEDINGS OF THE IEEE, 2015, 103 (09) : 1531 - 1559
  • [50] Attention fusion network for multimodal sentiment analysis
Luo, Yuanyi
Wu, Rui
Liu, Jiafeng
Tang, Xianglong
MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83: 8207 - 8217