Attention-based multi-modal fusion sarcasm detection

Times Cited: 1
Authors
Liu, Jing [1 ]
Tian, Shengwei [1 ]
Yu, Long [2 ]
Long, Jun [3 ,4 ]
Zhou, Tiejun [5 ]
Wang, Bo [1 ]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Network & Informat Ctr, Urumqi, Xinjiang, Peoples R China
[3] Cent South Univ, Sch Informat Sci & Engn, Changsha, Peoples R China
[4] Cent South Univ, Big Data & Knowledge Engn Inst, Changsha, Peoples R China
[5] Xinjiang Internet Informat Ctr, Urumqi, Xinjiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-modal; sarcasm detection; Attention; ViT; D-BiGRU;
DOI
10.3233/JIFS-213501
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sarcasm is a way of expressing one's thoughts in which the intended meaning is often the opposite of the literal one. Previous work on sarcasm detection focused mainly on text, but most information today is multi-modal, combining text and images, so multi-modal sarcasm detection has become an increasingly active research topic. To better capture the true meaning of multi-modal sarcastic content, this paper proposes an attention-based multi-modal fusion sarcasm detection model, which introduces the Vision Transformer (ViT) to extract image features and designs a Double-Layer Bi-Directional Gated Recurrent Unit (D-BiGRU) to extract text features. The features of the two modalities are fused into a single vector, enhanced by attention, and then used for prediction. The proposed model achieves significant results on the baseline datasets, outperforming the best baseline model by 0.71% in F1-score and 0.38% in accuracy.
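The pipeline the abstract describes — a separate encoder per modality, followed by attention-weighted fusion into one vector — can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the ViT and D-BiGRU encoder outputs are replaced by random placeholder vectors, and the names `attention_fuse` and `W_att` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(img_feat, txt_feat, W_att):
    """Fuse two modality vectors into one attention-weighted vector."""
    feats = np.stack([img_feat, txt_feat])   # shape (2, d)
    scores = feats @ W_att                   # one score per modality, shape (2,)
    weights = softmax(scores, axis=0)        # attention over the two modalities
    return (weights[:, None] * feats).sum(axis=0)  # fused vector, shape (d,)

d = 8
img_feat = rng.normal(size=d)  # stand-in for a ViT image embedding
txt_feat = rng.normal(size=d)  # stand-in for a D-BiGRU text embedding
W_att = rng.normal(size=d)     # stand-in for a learned attention projection

fused = attention_fuse(img_feat, txt_feat, W_att)
print(fused.shape)  # (8,)
```

In the paper's model the fused vector would then feed a classifier that predicts sarcastic vs. non-sarcastic; here the attention simply learns how much each modality should contribute to that decision.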
Pages: 2097-2108
Number of Pages: 12