Video multimodal sentiment analysis using cross-modal feature translation and dynamical propagation

Cited by: 2
Authors
Gan, Chenquan [1,2,3]
Tang, Yu [1]
Fu, Xiang [1]
Zhu, Qingyi [2]
Jain, Deepak Kumar [4,5]
Garcia, Salvador [6]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Sch Commun & Informat Engn, Chongqing 400065, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Sch Cyber Secur & Informat Law, Chongqing 400065, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Key Lab Big Data Intelligent Comp, Chongqing 400065, Peoples R China
[4] Dalian Univ Technol, Key Lab Intelligent Control & Optimizat Ind Equipm, Minist Educ, Dalian 116024, Peoples R China
[5] Symbiosis Int Univ, Symbiosis Inst Technol, Pune 412115, India
[6] Univ Granada, Andalusian Res Inst Data Sci & Computat Intelligen, Dept Comp Sci & Artificial Intelligence, Granada 18071, Spain
Keywords
Video multimodal sentiment analysis; Public emotion feature; Cross-modal feature translation; Dynamical propagation model
DOI
10.1016/j.knosys.2024.111982
CLC classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multimodal sentiment analysis on social platforms is crucial for comprehending public opinions and attitudes, thus garnering substantial interest in knowledge engineering. Existing methods like implicit interaction, explicit interaction, and cross-modal translation can effectively integrate sentiment information, but they encounter challenges in establishing efficient emotional correlations across modalities due to data heterogeneity and concealed emotional relationships. To tackle this issue, we propose a video multimodal sentiment analysis model called PEST, which leverages cross-modal feature translation and a dynamic propagation model. Specifically, cross-modal feature translation translates textual, visual, and acoustic features into a common feature space, eliminating heterogeneity and enabling initial modal interaction. Additionally, the dynamic propagation model facilitates in-depth interaction and aids in establishing stable and reliable emotional correlations across modalities. Extensive experiments on three multimodal sentiment datasets, CMU-MOSI, CMU-MOSEI, and CH-SIMS, demonstrate that PEST exhibits superior performance in both word-aligned and unaligned settings.
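The abstract's two-stage design, first translating each modality into a common feature space and then iterating cross-modal interaction, can be illustrated with a minimal PyTorch sketch. This is not the authors' PEST implementation: the layer sizes, the gated-average propagation rule, and the feature extractors named in the comments are all assumptions made for illustration.

```python
# Sketch of (1) cross-modal feature translation into a common space and
# (2) an iterative cross-modal propagation step. Dimensions and update
# rule are assumptions; the paper's actual equations are not in the abstract.
import torch
import torch.nn as nn

class CrossModalTranslation(nn.Module):
    """Project text, visual, and acoustic features into one common space."""
    def __init__(self, dims=(768, 35, 74), common_dim=128):
        super().__init__()
        self.projections = nn.ModuleList([nn.Linear(d, common_dim) for d in dims])

    def forward(self, feats):
        # feats: one (batch, dim) tensor per modality
        return [proj(x) for proj, x in zip(self.projections, feats)]

class DynamicPropagation(nn.Module):
    """Repeatedly blend each modality's state with the other modalities
    through a learned gate (a stand-in for the paper's propagation model)."""
    def __init__(self, common_dim=128, steps=3):
        super().__init__()
        self.gate = nn.Linear(2 * common_dim, common_dim)
        self.steps = steps

    def forward(self, states):
        for _ in range(self.steps):
            new_states = []
            for i, h in enumerate(states):
                # Context = mean of the other modalities' current states.
                ctx = torch.stack([s for j, s in enumerate(states) if j != i]).mean(0)
                g = torch.sigmoid(self.gate(torch.cat([h, ctx], dim=-1)))
                new_states.append(g * h + (1 - g) * ctx)  # gated blend
            states = new_states
        return states

if __name__ == "__main__":
    batch = 4
    text = torch.randn(batch, 768)   # e.g., BERT sentence features (assumed)
    visual = torch.randn(batch, 35)  # e.g., facial-expression features (assumed)
    audio = torch.randn(batch, 74)   # e.g., COVAREP-style features (assumed)
    shared = CrossModalTranslation()([text, visual, audio])
    fused = torch.cat(DynamicPropagation()(shared), dim=-1)
    sentiment = nn.Linear(3 * 128, 1)(fused)  # simple regression head
    print(sentiment.shape)  # torch.Size([4, 1])
```

In the actual model, the gated-average update above would be replaced by the paper's dynamical propagation equations; the sketch only shows where such a step sits between translation and prediction.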
Pages: 11