Transformer-Based Interactive Multi-Modal Attention Network for Video Sentiment Detection

Cited by: 0
Authors
Xuqiang Zhuang
Fangai Liu
Jian Hou
Jianhua Hao
Xiaohong Cai
Affiliations
[1] Shandong Normal University,School of Information Science and Engineering
[2] Shandong University of Traditional Chinese Medicine,School of Intelligence and Information Engineering
Source
Neural Processing Letters | 2022, Vol. 54
Keywords
Multimodal; Transformer; Sentiment detection;
DOI
Not available
Abstract
Social media allows users to express opinions in multiple modalities such as text, pictures, and short videos. Multi-modal sentiment detection can more effectively predict the emotional tendencies expressed by users and has therefore received extensive attention in recent years. Current works treat utterances from videos as independent modalities, ignoring the effective interaction among the different modalities of a video. To tackle these challenges, we propose a transformer-based interactive multi-modal attention network that investigates multi-modal paired attention between multiple modalities and utterances for video sentiment detection. Specifically, we first take a series of utterances as input and use three separate transformer encoders to capture the utterance-level features of each modality. Subsequently, we introduce a multi-modal paired attention mechanism to learn the cross-modality information between multiple modalities and utterances. Finally, we inject the cross-modality information into a multi-headed self-attention layer to make the final emotion and sentiment classification. Our solution outperforms baseline models on three multi-modal datasets.
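The pipeline the abstract describes (per-modality utterance features, then paired attention across modalities, then fusion for classification) can be sketched as a minimal NumPy toy. This is an illustrative assumption, not the authors' implementation: the function names, the concatenation-based fusion, and the random features stand in for the paper's transformer encoders and classification head.

```python
import numpy as np

def scaled_dot_attention(q, k, v):
    """Standard scaled dot-product attention.
    q: (n_q, d) queries; k, v: (n_k, d) keys and values."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # softmax numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def paired_cross_modal_attention(feats):
    """For each modality, attend its utterance features over every other
    modality, then concatenate the original and cross-modal views.
    feats: dict modality -> (n_utterances, d) array."""
    fused = {}
    for a, fa in feats.items():
        cross = [scaled_dot_attention(fa, fb, fb)
                 for b, fb in feats.items() if b != a]
        fused[a] = np.concatenate([fa] + cross, axis=-1)
    return fused

# Toy stand-ins for utterance-level encoder outputs of the three modalities.
rng = np.random.default_rng(0)
n_utt, d = 5, 8
feats = {m: rng.standard_normal((n_utt, d)) for m in ("text", "audio", "video")}
fused = paired_cross_modal_attention(feats)
print(fused["text"].shape)  # (5, 24): own features plus two cross-modal views
```

In the paper, the fused representation would feed a further multi-headed self-attention layer before the emotion/sentiment classifier; here the concatenation merely shows where the cross-modality information is injected.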
Pages: 1943 - 1960
Number of pages: 17
Related Papers
50 items in total
  • [21] UniTR: A Unified TRansformer-Based Framework for Co-Object and Multi-Modal Saliency Detection
    Guo, Ruohao
    Ying, Xianghua
    Qi, Yanyu
    Qu, Liao
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7622 - 7635
  • [22] Pedestrian Crossing Intention Prediction with Multi-Modal Transformer-Based Model
    Wang, Ting Wei
    Lai, Shang-Hong
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 1349 - 1356
  • [23] MBIAN: Multi-level bilateral interactive attention network for multi-modal
    Sun, Kai
    Zhang, Jiangshe
    Wang, Jialin
    Xu, Shuang
    Zhang, Chunxia
    Hu, Junying
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 231
  • [24] Video Review Analysis via Transformer-based Sentiment Change Detection
    Wu, Zilong
    Huang, Siyuan
    Zhang, Rui
    Li, Lin
    THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2020), 2020, : 330 - 335
  • [25] Attention-based Multi-modal Sentiment Analysis and Emotion Detection in Conversation using RNN
    Huddar, Mahesh G.
    Sannakki, Sanjeev S.
    Rajpurohit, Vijay S.
    INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2021, 6 (06): 112 - 121
  • [26] Multi-modal fusion attention sentiment analysis for mixed sentiment classification
    Xue, Zhuanglin
    Xu, Jiabin
    COGNITIVE COMPUTATION AND SYSTEMS, 2024
  • [27] Transformer-based Automatic Music Mood Classification Using Multi-modal Framework
    Kumar, Sujeesha Ajithakumari Suresh
    Rajan, Rajeev
    JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY, 2023, 23 (01): 18 - 34
  • [28] TransOrga: End-To-End Multi-modal Transformer-Based Organoid Segmentation
    Qin, Yiming
    Li, Jiajia
    Chen, Yulong
    Wang, Zikai
    Huang, Yu-An
    You, Zhuhong
    Hu, Lun
    Hu, Pengwei
    Tan, Feng
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT III, 2023, 14088 : 460 - 472
  • [29] Contextual Inter-modal Attention for Multi-modal Sentiment Analysis
    Ghosal, Deepanway
    Akhtar, Md Shad
    Chauhan, Dushyant
    Poria, Soujanya
    Ekbal, Asif
    Bhattacharyya, Pushpak
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3454 - 3466
  • [30] Mixture of Attention Variants for Modal Fusion in Multi-Modal Sentiment Analysis
    He, Chao
    Zhang, Xinghua
    Song, Dongqing
    Shen, Yingshan
    Mao, Chengjie
    Wen, Huosheng
    Zhu, Dingju
    Cai, Lihua
    BIG DATA AND COGNITIVE COMPUTING, 2024, 8 (02)