Multimodal dual perception fusion framework for multimodal affective analysis

Times Cited: 0
Authors
Lu, Qiang [1 ]
Sun, Xia [1 ]
Long, Yunfei [2 ]
Zhao, Xiaodi [1 ]
Zou, Wang [1 ]
Feng, Jun [1 ]
Wang, Xuxin [1 ]
Affiliations
[1] Northwest Univ, Sch Informat Sci & Technol, Xian 710127, Peoples R China
[2] Univ Essex, Sch Comp Sci & Elect Engn, Colchester CO4 3SQ, England
Keywords
Multimodal sentiment analysis; Sarcasm detection; Fake news detection; Multimodal affective analysis; Multimodal dual perception fusion
DOI
10.1016/j.inffus.2024.102747
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The misuse of social platforms and the difficulty of moderating posted content have led to a surge of negative sentiment, sarcasm, and rampantly spreading fake news. In response, multimodal sentiment analysis, sarcasm detection, and fake news detection over image and text have recently attracted considerable attention. Because these tasks share semantic and sentiment features and face related fusion challenges in deciphering complex human expressions across modalities, integrating them into a unified framework is expected to simplify research in sentiment analysis and to enhance classification tasks that involve both semantic and sentiment modeling. We therefore treat these tasks as integral components of a broader line of research, multimodal affective analysis over semantics and sentiment, and propose a novel multimodal dual perception fusion framework (MDPF). Specifically, MDPF comprises three core procedures: (1) generating bootstrapping language-image knowledge to enrich the original modality space, and applying cross-modal contrastive learning to align the text and image modalities and capture their underlying semantics and interactions; (2) designing a dynamic connective mechanism to adaptively match image-text pairs, jointly employing a Gaussian-weighted distribution to intensify semantic sequences; and (3) constructing a cross-modal graph that preserves the structured information of both image and text data and shares information between modalities, while introducing sentiment knowledge to refine the graph's edge weights and capture cross-modal sentiment interaction. We evaluate MDPF on three publicly available datasets across three tasks, and the empirical results demonstrate the superiority of the proposed model.
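To make procedure (1) concrete, the following is a minimal sketch of CLIP-style cross-modal contrastive alignment in PyTorch. The function name, embedding dimensions, and temperature are illustrative assumptions; this is not the paper's actual implementation.

```python
# Minimal sketch of cross-modal contrastive alignment (CLIP-style InfoNCE).
# Names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(text_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired text/image embeddings."""
    # L2-normalize so dot products become cosine similarities.
    t = F.normalize(text_emb, dim=-1)
    v = F.normalize(image_emb, dim=-1)
    # Pairwise similarity matrix: (batch, batch).
    logits = t @ v.t() / temperature
    # Matched image-text pairs sit on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)
    # Average the text-to-image and image-to-text directions.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# Usage with random embeddings standing in for encoder outputs:
text_emb = torch.randn(8, 256)
image_emb = torch.randn(8, 256)
loss = contrastive_alignment_loss(text_emb, image_emb)
```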
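Procedure (2)'s "Gaussian-weighted distribution to intensify semantic sequences" can be read as reweighting token features with a Gaussian kernel over sequence positions. The sketch below encodes that one plausible reading; the anchor position and width (`center`, `sigma`) are assumptions, not the paper's formulation.

```python
# Hedged sketch: Gaussian-weighted intensification of a token sequence.
# The centering/width choices are assumptions made for illustration.
import torch

def gaussian_intensify(seq, center, sigma=2.0):
    """Scale a (length, dim) token sequence by a Gaussian over positions."""
    positions = torch.arange(seq.size(0), dtype=seq.dtype)
    weights = torch.exp(-0.5 * ((positions - center) / sigma) ** 2)
    # Broadcast the per-position weights over the feature dimension.
    return seq * weights.unsqueeze(-1)

tokens = torch.randn(10, 256)            # stand-in for text token features
intensified = gaussian_intensify(tokens, center=4)
```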
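For procedure (3), the sketch below shows one way a sentiment-refined cross-modal graph step could look: feature-similarity edges rescaled by sentiment agreement, followed by a single normalized propagation. The graph construction and the sentiment scoring are illustrative assumptions, not the paper's exact design.

```python
# Hedged sketch of a cross-modal graph step: text and image nodes share one
# graph, and per-node sentiment scores refine edge weights before a single
# GCN-style propagation. All specifics are assumptions for illustration.
import torch
import torch.nn.functional as F

def sentiment_weighted_propagate(nodes, sentiment):
    """nodes: (n, dim) stacked text+image features; sentiment: (n,) scores."""
    # Base affinity from pairwise feature similarity.
    sim = F.cosine_similarity(nodes.unsqueeze(1), nodes.unsqueeze(0), dim=-1)
    # Refine edges: pairs with similar sentiment get stronger weights.
    affect = torch.exp(-(sentiment.unsqueeze(1) - sentiment.unsqueeze(0)).abs())
    adj = F.relu(sim) * affect
    # Symmetric normalization (D^-1/2 A D^-1/2), then one propagation step.
    norm = adj.sum(-1).clamp(min=1e-6).rsqrt()
    adj_hat = norm.unsqueeze(1) * adj * norm.unsqueeze(0)
    return adj_hat @ nodes

nodes = torch.randn(6, 256)              # e.g. 3 text + 3 image node features
sentiment = torch.rand(6)                # stand-in sentiment score per node
updated = sentiment_weighted_propagate(nodes, sentiment)
```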
Pages: 12
Related Papers
50 records in total
  • [1] Towards an intelligent framework for multimodal affective data analysis
    Poria, Soujanya
    Cambria, Erik
    Hussain, Amir
    Huang, Guang-Bin
    NEURAL NETWORKS, 2015, 63 : 104 - 116
  • [2] A review of affective computing: From unimodal analysis to multimodal fusion
    Poria, Soujanya
    Cambria, Erik
    Bajpai, Rajiv
    Hussain, Amir
    INFORMATION FUSION, 2017, 37 : 98 - 125
  • [3] Multimodal Deep Denoise Framework for Affective Video Content Analysis
    Zhu, Yaochen
    Chen, Zhenzhong
    Wu, Feng
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 130 - 138
  • [4] PERCEPTION OF THE PROSODIC FORMATIVE OF MULTIMODAL AFFECTIVE STATES
    Barabanschikov, Vladimir A.
    Suvorova, Ekaterina V.
    Malionok, Arina V.
    EKSPERIMENTALNAYA PSIKHOLOGIYA, 2024, 17 (03): 30 - 51
  • [5] LLM-Driven Multimodal Fusion for Human Perception Analysis
    Esteban-Romero, Sergio
    Martin-Fernandez, Ivan
    Gil-Martin, Manuel
    Griol-Barres, David
    Callejas-Carrion, Zoraida
    Fernandez-Martinez, Fernando
    PROCEEDINGS OF THE 5TH MULTIMODAL SENTIMENT ANALYSIS CHALLENGE AND WORKSHOP: SOCIAL PERCEPTION AND HUMOR, MUSE 2024, 2024, : 45 - 51
  • [6] A Multimodal Framework for Unsupervised Feature Fusion
    Li, Xiaoyi
    Gao, Jing
    Li, Hui
    Yang, Le
    Srihari, Rohini K.
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, : 897 - 902
  • [7] Prompt Link Multimodal Fusion in Multimodal Sentiment Analysis
    Zhu, Kang
    Fan, Cunhang
    Tao, Jianhua
    Lv, Zhao
    INTERSPEECH 2024, 2024, : 4668 - 4672
  • [8] Multi-token Fusion Framework for Multimodal Sentiment Analysis
    Long, Zhihui
    Deng, Huan
    Yang, Zhenguo
    Liu, Wenyin
    WEB AND BIG DATA, PT II, APWEB-WAIM 2023, 2024, 14332 : 424 - 438
  • [9] Collaborative Multimodal Fusion Network for Multiagent Perception
    Zhang, Lei
    Wang, Binglu
    Zhao, Yongqiang
    Yuan, Yuan
    Zhou, Tianfei
    Li, Zhijun
    IEEE TRANSACTIONS ON CYBERNETICS, 2025, 55 (01) : 486 - 498
  • [10] Affective video content analysis based on multimodal data fusion in heterogeneous networks
    Guo, Jie
    Song, Bin
    Zhang, Peng
    Ma, Mengdi
    Luo, Wenwen
    Junmei
    INFORMATION FUSION, 2019, 51 : 224 - 232