Attention-based multi-modal fusion sarcasm detection

Cited by: 1
Authors
Liu, Jing [1 ]
Tian, Shengwei [1 ]
Yu, Long [2 ]
Long, Jun [3 ,4 ]
Zhou, Tiejun [5 ]
Wang, Bo [1 ]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Network & Informat Ctr, Urumqi, Xinjiang, Peoples R China
[3] Cent South Univ, Sch Informat Sci & Engn, Changsha, Peoples R China
[4] Cent South Univ, Big Data & Knowledge Engn Inst, Changsha, Peoples R China
[5] Xinjiang Internet Informat Ctr, Urumqi, Xinjiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-modal; sarcasm detection; Attention; ViT; D-BiGRU;
DOI
10.3233/JIFS-213501
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Sarcasm is a way for a person to express their thoughts, and the intended meaning is often the opposite of the apparent one. Previous work on sarcasm detection focused mainly on text, but most information today is multi-modal, combining text and images, so multi-modal sarcasm detection has become an increasingly active research topic. To better capture the true meaning of multi-modal sarcastic content, this paper proposes an attention-based multi-modal fusion sarcasm detection model, which introduces the Vision Transformer (ViT) to extract image features and designs a Double-Layer Bi-Directional Gated Recurrent Unit (D-BiGRU) to extract text features. The features of the two modalities are fused into a single feature vector, enhanced by attention, and then used for prediction. The proposed model achieves significant gains on the benchmark dataset, outperforming the best baseline model by 0.71% in F1-score and 0.38% in accuracy.
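To make the pipeline described in the abstract concrete, the following is a minimal PyTorch sketch. It assumes a pretrained torchvision ViT backbone for images, a two-layer bidirectional GRU as the D-BiGRU text encoder, and a simple softmax attention over the two modality vectors before a binary classifier; the layer sizes, vocabulary size, and attention formulation are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch only: hyper-parameters and fusion details below are assumptions,
# since the record above does not specify the paper's exact configuration.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights


class AttentionFusionSarcasmDetector(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=300, hidden_dim=256, fused_dim=512):
        super().__init__()
        # Image branch: pretrained ViT with its classification head removed,
        # so the forward pass returns the 768-d [CLS] representation.
        self.vit = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
        self.vit.heads = nn.Identity()
        self.img_proj = nn.Linear(768, fused_dim)

        # Text branch: double-layer bidirectional GRU (stand-in for D-BiGRU).
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bigru = nn.GRU(embed_dim, hidden_dim, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.txt_proj = nn.Linear(2 * hidden_dim, fused_dim)

        # Attention over the two modality vectors, then binary prediction.
        self.attn = nn.Linear(fused_dim, 1)
        self.classifier = nn.Linear(fused_dim, 2)

    def forward(self, image, token_ids):
        img_feat = self.img_proj(self.vit(image))            # (B, fused_dim)
        _, h_n = self.bigru(self.embed(token_ids))           # h_n: (4, B, hidden_dim)
        # Concatenate the last layer's forward and backward hidden states.
        txt_feat = self.txt_proj(torch.cat([h_n[-2], h_n[-1]], dim=-1))

        # Stack the modality vectors and weight them with softmax attention.
        modalities = torch.stack([img_feat, txt_feat], dim=1)   # (B, 2, fused_dim)
        weights = torch.softmax(self.attn(modalities), dim=1)   # (B, 2, 1)
        fused = (weights * modalities).sum(dim=1)               # (B, fused_dim)
        return self.classifier(fused)                           # (B, 2) class logits


if __name__ == "__main__":
    model = AttentionFusionSarcasmDetector()
    img = torch.randn(2, 3, 224, 224)           # batch of 2 images
    tokens = torch.randint(1, 30000, (2, 40))   # 2 token sequences of length 40
    print(model(img, tokens).shape)             # torch.Size([2, 2])
```

In this sketch the attention simply reweights the two modality vectors before summation; the paper's own attention enhancement may differ, and a cross-modal attention layer could be substituted at the same point.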
Pages: 2097-2108
Page count: 12
Related papers
50 records in total
  • [21] AMM-FuseNet: Attention-Based Multi-Modal Image Fusion Network for Land Cover Mapping
    Ma, Wanli
    Karakuş, Oktay
    Rosin, Paul L.
    REMOTE SENSING, 2022, 14 (18)
  • [22] AGREE: Attention-Based Tour Group Recommendation with Multi-modal Data
    Hu, Fang
    Huang, Xiuqi
    Gao, Xiaofeng
    Chen, Guihai
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2019, 11448 : 314 - 318
  • [23] MAF-YOLO: Multi-modal attention fusion based YOLO for pedestrian detection
    Xue, Yongjie
    Ju, Zhiyong
    Li, Yuming
    Zhang, Wenxin
    INFRARED PHYSICS & TECHNOLOGY, 2021, 118
  • [24] Multi-Modal Sarcasm Detection Based on Cross-Modal Composition of Inscribed Entity Relations
    Li, Lingshan
    Jin, Di
    Wang, Xiaobao
    Guo, Fengyu
    Wang, Longbiao
    Dang, Jianwu
    2023 IEEE 35TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2023, : 918 - 925
  • [25] Multi-modal Perception Fusion Method Based on Cross Attention
    Zhang B.-L.
    Pan Z.-H.
    Jiang J.-Z.
    Zhang C.-B.
    Wang Y.-X.
    Yang C.-L.
    Zhongguo Gonglu Xuebao/China Journal of Highway and Transport, 2024, 37 (03) : 181 - 193
  • [26] AGGN: Attention-based glioma grading network with multi-scale feature extraction and multi-modal information fusion
    Wu, Peishu
    Wang, Zidong
    Zheng, Baixun
    Li, Han
    Alsaadi, Fuad E.
    Zeng, Nianyin
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 152
  • [27] Multi-Modal Sarcasm Detection with Interactive In-Modal and Cross-Modal Graphs
    Liang, Bin
    Lou, Chenwei
    Li, Xiang
    Gui, Lin
    Yang, Min
    Xu, Ruifeng
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4707 - 4715
  • [28] Multi-modal sarcasm detection using ensemble net model
    Sukhavasi, Vidyullatha
    Dondeti, Venkatesulu
    KNOWLEDGE AND INFORMATION SYSTEMS, 2025, 67 (01) : 403 - 425
  • [29] Vehicle Detection of Multi-Modal Attention Fusion Under Different Illumination
    Wang, Jiaqi
    Zhang, Qi
    Huang, Wei
    Computer Engineering and Applications, 2024, 60 (16) : 116 - 123
  • [30] A novel transformer attention-based approach for sarcasm detection
    Khan, Shumaila
    Qasim, Iqbal
    Khan, Wahab
    Aurangzeb, Khursheed
    Khan, Javed Ali
    Anwar, Muhammad Shahid
    EXPERT SYSTEMS, 2025, 42 (01)