A Multimodal Sentiment Analysis Model Enhanced with Non-verbal Information and Contrastive Learning

Cited by: 0
Authors
Liu, Jia [1 ,2 ,3 ,4 ]
Song, Hong [1 ,2 ]
Chen, Dapeng [1 ,2 ,3 ,4 ]
Wang, Bin [1 ,2 ]
Zhang, Zengwei [1 ,2 ]
Affiliations
[1] Tianchang Research Institute, Nanjing University of Information Science & Technology, Chuzhou 239300, China
[2] School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, China
[3] Jiangsu Province Engineering Research Center of Intelligent Meteorological Exploration Robot, Nanjing 210044, China
[4] Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing 210044, China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Semantics
DOI
10.11999/JEIT231274
Abstract
In recent years, deep learning methods have gained popularity in multimodal sentiment analysis due to their strong representation and fusion capabilities. Existing studies typically analyze an individual's emotions from multimodal information such as text, facial expressions, and speech intonation, primarily through complex fusion methods. However, existing models inadequately capture the dynamic changes of emotion over long time sequences, which limits their performance in sentiment analysis. To address this issue, a Multimodal Sentiment Analysis Model Enhanced with Non-verbal Information and Contrastive Learning is proposed in this paper. First, the model leverages long-term textual information to learn the dynamic changes of audio and video across extended time sequences. Next, a gating mechanism eliminates redundant information and semantic ambiguity between modalities. Finally, contrastive learning strengthens the interaction between modalities and improves the model's generalization. Experimental results show that on the CMU-MOSI dataset the model improves the Pearson correlation coefficient (Corr) and F1 score by 3.7% and 2.1%, respectively, and on the CMU-MOSEI dataset it increases Corr and F1 score by 1.4% and 1.1%, respectively. The proposed model therefore exploits intermodal interaction information effectively while eliminating information redundancy. © 2024 Science Press. All rights reserved.
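The abstract names two core mechanisms but gives no implementation details. A minimal NumPy sketch of what such components commonly look like follows; all function names, shapes, and parameters here are illustrative assumptions, not the authors' code: a text-conditioned gate that suppresses redundant non-verbal features, and an InfoNCE-style contrastive loss that pulls matched cross-modal pairs together.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(text, nonverbal, Wg, bg):
    """Text-conditioned gate over non-verbal (audio/video) features.

    The gate g in (0, 1) is computed from the concatenated text and
    non-verbal representations, then scales the non-verbal features so
    that information redundant with the text can be attenuated.
    Wg: (d_text + d_nv, d_nv) weight matrix, bg: (d_nv,) bias.
    """
    g = sigmoid(np.concatenate([text, nonverbal], axis=-1) @ Wg + bg)
    return g * nonverbal

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss over paired modality embeddings.

    Row i of `anchors` and row i of `positives` form a matched pair
    (e.g. text and audio from the same utterance); all other rows in
    the batch act as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature              # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # diagonal = matched pairs
```

This is only a sketch of the general techniques (gated fusion and contrastive alignment); the paper's actual architecture, loss weighting, and long-sequence modeling are not specified in the abstract.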
Pages: 3372-3381
Related Papers
50 records in total
  • [1] Multimodal hypergraph network with contrastive learning for sentiment analysis
    Huang, Jian
    Jiang, Kun
    Pu, Yuanyuan
    Zhao, Zhengpeng
    Yang, Qiuxia
    Gu, Jinjing
    Xu, Dan
    NEUROCOMPUTING, 2025, 627
  • [2] MSA-HCL: MULTIMODAL SENTIMENT ANALYSIS MODEL WITH HYBRID CONTRASTIVE LEARNING
    Zhao, Wang
    Zhang, Yong
    Hua, Qiang
    Dong, Chun-ru
    Wang, Jia-nan
    Zhang, Feng
    MATHEMATICAL FOUNDATIONS OF COMPUTING, 2025, 8 (03): 433-447
  • [3] Aspect-Based Sentiment Analysis Model of Multimodal Collaborative Contrastive Learning
    Yu, Bengong
    Xing, Yu
    Zhang, Shuwen
    DATA ANALYSIS AND KNOWLEDGE DISCOVERY, 2024, 8 (11): 22-32
  • [4] Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis
    An, Jieyu
    Zainon, Wan Mohd Nazmee Wan
    Ding, Binfen
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 37 (02): 1673-1689
  • [5] Dynamic Weighted Multitask Learning and Contrastive Learning for Multimodal Sentiment Analysis
    Wang, Xingqi
    Zhang, Mengrui
    Chen, Bin
    Wei, Dan
    Shao, Yanli
    ELECTRONICS, 2023, 12 (13)
  • [6] Text-Centric Multimodal Contrastive Learning for Sentiment Analysis
    Peng, Heng
    Gu, Xue
    Li, Jian
    Wang, Zhaodan
    Xu, Hao
    ELECTRONICS, 2024, 13 (06)
  • [7] Analysis and synthesis of multimodal verbal and non-verbal interaction for animated interface agents
    Beskow, Jonas
    Granstrom, Bjorn
    House, David
    VERBAL AND NONVERBAL COMMUNICATION BEHAVIOURS, 2007, 4775: 250+
  • [8] COMPARATIVE ANALYSIS OF VERBAL AND NON-VERBAL METHODS TO OBTAIN MANAGERIAL INFORMATION
    Lobanova, E. N.
    RUDN JOURNAL OF SOCIOLOGY-VESTNIK ROSSIISKOGO UNIVERSITETA DRUZHBY NARODOV SERIYA SOTSIOLOGIYA, 2013, (04): 117-126
  • [9] THE COHERENCE OF VERBAL AND NON-VERBAL COMMUNICATION ELEMENTS IN THE POLITICAL TV TALK SHOWS: CONTRASTIVE ANALYSIS
    Poskiene, Audrone
    PSYCHOLOGY AND PSYCHIATRY, SOCIOLOGY AND HEALTHCARE, EDUCATION, VOL II, 2015: 969-980
  • [10] Non-verbal learning disabilities: The syndrome and the model.
    Szatmari, P
    Morgan, J
    CANADIAN JOURNAL OF PSYCHIATRY-REVUE CANADIENNE DE PSYCHIATRIE, 1997, 42 (10): 1081