Transformer-based correlation mining network with self-supervised label generation for multimodal sentiment analysis

Cited by: 1
Authors
Wang, Ruiqing [1]
Yang, Qimeng [1]
Tian, Shengwei [1]
Yu, Long [2]
He, Xiaoyu [3]
Wang, Bo [1]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Network & Informat Ctr, Network, Xinjiang, Peoples R China
[3] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multimodal sentiment analysis; Transformer; Multimodal fusion; Collaborative learning; FUSION
DOI
10.1016/j.neucom.2024.129163
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multimodal Sentiment Analysis (MSA), which aims to recognize and understand a speaker's sentiment state by integrating information from natural language, facial expressions, and voice, has gained much attention in recent years. However, modeling multimodal data poses two main challenges: 1) Potential sentiment correlations exist both between modalities and within context, which makes deep-level sentiment correlation mining and information fusion difficult; 2) Sentiment information tends to be unevenly distributed across modalities, which makes it hard to fully leverage each modality for collaborative learning. To address these challenges, we propose CMLG, a method based on correlation mining and label generation. The approach uses a Squeeze and Excitation Network (SEN) to recalibrate modality features and employs Transformer-based intra-modal and inter-modal feature extractors to mine the intrinsic connections between different modalities. In addition, we design a Self-Supervised Label Generation Module (SLGM) that relies on the positive correlation between feature distances and label offsets to generate single-peak labels, and jointly train multi-peak and single-peak tasks to detect sentiment differences. Extensive experiments on three benchmark datasets (MOSI, MOSEI and SIMS) show that CMLG achieves excellent results.
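The feature recalibration step mentioned in the abstract can be illustrated with a generic squeeze-and-excitation gate. The following is a minimal sketch only, not the authors' SEN: the function name `se_recalibrate`, the weight matrices `w1` and `w2`, the reduction ratio `r`, and the choice to squeeze over the time axis are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def se_recalibrate(x, w1, w2):
    """Generic squeeze-and-excitation gate (illustrative, hypothetical names).

    x  : list of T rows, each holding d feature channels for one time step
    w1 : (d // r) x d bottleneck weights (assumed shape)
    w2 : d x (d // r) expansion weights (assumed shape)
    Returns the rescaled features and the per-channel gates.
    """
    T, d = len(x), len(x[0])
    # Squeeze: average each channel over the time axis.
    z = [sum(row[c] for row in x) / T for c in range(d)]
    # Excitation: bottleneck layer with ReLU, then sigmoid gates in (0, 1).
    h = [max(0.0, sum(w * zc for w, zc in zip(w_row, z))) for w_row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * hc for w, hc in zip(w_row, h))))
         for w_row in w2]
    # Scale: reweight every channel of every time step by its gate.
    y = [[v * g for v, g in zip(row, s)] for row in x]
    return y, s


# Toy usage with random features and weights (all values are made up).
random.seed(0)
T, d, r = 4, 6, 2
x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]
w1 = [[random.gauss(0, 0.5) for _ in range(d)] for _ in range(d // r)]
w2 = [[random.gauss(0, 0.5) for _ in range(d // r)] for _ in range(d)]
y, gates = se_recalibrate(x, w1, w2)
```

Each gate lies strictly between 0 and 1, so channels judged less informative are attenuated rather than removed. In a real model the weights would be learned jointly with the downstream Transformer extractors.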
Pages: 9