ECENet: Explainable and Context-Enhanced Network for Multi-modal Fact Verification

Cited by: 4
Authors
Zhang, Fanrui [1 ]
Liu, Jiawei [1 ]
Zhang, Qiang [1 ]
Sun, Esther [2 ]
Xie, Jingyi [1 ]
Zha, Zheng-Jun [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Univ Toronto, Toronto, ON, Canada
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
National Natural Science Foundation of China; National Key R&D Program of China;
Keywords
Multi-modal fact verification; Attention mechanism; Deep reinforcement learning; Interpretability;
DOI
10.1145/3581783.3612183
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, falsified claims incorporating both text and images have been disseminated more effectively than those containing text alone, raising significant concerns for multi-modal fact verification. Existing research has contributed to multi-modal feature extraction and interaction, but fails to fully exploit and enhance the valuable and intricate semantic relationships between distinct features. Moreover, most detectors merely provide a single outcome judgment and lack an inference process or explanation. Taking these factors into account, we propose a novel Explainable and Context-Enhanced Network (ECENet) for multi-modal fact verification, making the first attempt to integrate multi-clue feature extraction, multi-level feature reasoning, and justification (explanation) generation within a unified framework. Specifically, we propose an Improved Coarse- and Fine-grained Attention Network, equipped with two types of level-grained attention mechanisms, to facilitate a comprehensive understanding of contextual information. Furthermore, we propose a novel justification generation module based on deep reinforcement learning that requires no additional labels. In this module, a sentence extractor agent measures the importance of each document sentence with respect to the query claim at each time step, selecting a suitable number of high-scoring sentences to be rewritten as the model's explanation. Extensive experiments demonstrate the effectiveness of the proposed method.
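
To make the justification module concrete, below is a minimal, illustrative sketch of the sentence-extractor agent the abstract describes. This is not the authors' released implementation: the names (SentenceExtractor, select_sentences, hidden_dim), the bilinear scorer, and the top-k selection are all assumptions standing in for the paper's actual architecture, reward design, and training procedure.

import torch
import torch.nn as nn

class SentenceExtractor(nn.Module):
    """Scores each document sentence against the query claim (hypothetical design)."""
    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        # A bilinear scorer over (claim, sentence) embedding pairs; the paper's
        # agent may use a different scoring function.
        self.scorer = nn.Bilinear(hidden_dim, hidden_dim, 1)

    def forward(self, claim_emb: torch.Tensor, sent_embs: torch.Tensor) -> torch.Tensor:
        # claim_emb: (hidden_dim,); sent_embs: (num_sentences, hidden_dim)
        claim = claim_emb.unsqueeze(0).expand_as(sent_embs).contiguous()
        return self.scorer(claim, sent_embs).squeeze(-1)  # (num_sentences,)

def select_sentences(extractor: SentenceExtractor, claim_emb: torch.Tensor,
                     sent_embs: torch.Tensor, k: int = 3) -> list:
    """Greedy inference: pick the k highest-scoring sentences as explanation candidates."""
    with torch.no_grad():
        scores = extractor(claim_emb, sent_embs)
    topk = torch.topk(scores, k=min(k, sent_embs.size(0))).indices
    return topk.tolist()

# Usage with random embeddings standing in for encoder outputs:
extractor = SentenceExtractor()
claim = torch.randn(768)
sentences = torch.randn(12, 768)
print(select_sentences(extractor, claim, sentences, k=3))

During training, the "deep reinforcement learning ... no additional labels" setup from the abstract would correspond to sampling the selection from a policy (e.g., a softmax over these scores) and updating it with a policy-gradient method such as REINFORCE against a verification-quality reward; the greedy top-k shown here stands in only for inference.
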
Pages: 1231 - 1240
Page count: 10
Related Papers
50 records in total
  • [31] SEANet: A Multi-modal Speech Enhancement Network
    Tagliasacchi, Marco
    Li, Yunpeng
    Misiunas, Karolis
    Roblek, Dominik
    INTERSPEECH 2020, 2020, : 1126 - 1130
  • [32] Distributed modular toolbox for multi-modal context recognition
    Bannach, D
    Kunze, K
    Lukowicz, P
    Amft, O
    ARCHITECTURE OF COMPUTING SYSTEMS - ARCS 2006, PROCEEDINGS, 2006, 3894 : 99 - 113
  • [33] Multi-modal Experts Network for Autonomous Driving
    Fang, Shihong
    Choromanska, Anna
    2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020, : 6439 - 6445
  • [34] Mineral: Multi-modal Network Representation Learning
    Kefato, Zekarias T.
    Sheikh, Nasrullah
    Montresor, Alberto
    MACHINE LEARNING, OPTIMIZATION, AND BIG DATA, MOD 2017, 2018, 10710 : 286 - 298
  • [35] Towards Multi-Modal Context Recognition for Hearing Instruments
    Tessendorf, Bernd
    Bulling, Andreas
    Roggen, Daniel
    Stiefmeier, Thomas
    Tröster, Gerhard
    Feilner, Manuela
    Derleth, Peter
    INTERNATIONAL SYMPOSIUM ON WEARABLE COMPUTERS (ISWC) 2010, 2010,
  • [36] Deep Robust Unsupervised Multi-Modal Network
    Yang, Yang
    Wu, Yi-Feng
    Zhan, De-Chuan
    Liu, Zhi-Bin
    Jiang, Yuan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 5652 - 5659
  • [37] A Multi-Modal Transformer network for action detection
    Korban, Matthew
    Youngs, Peter
    Acton, Scott T.
    PATTERN RECOGNITION, 2023, 142
  • [38] Self-supervised multi-modal fusion network for multi-modal thyroid ultrasound image diagnosis
    Xiang, Zhuo
    Zhuo, Qiuluan
    Zhao, Cheng
    Deng, Xiaofei
    Zhu, Ting
    Wang, Tianfu
    Jiang, Wei
    Lei, Baiying
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 150
  • [39] Multi-modal Siamese Network for Entity Alignment
    Chen, Liyi
    Li, Zhi
    Xu, Tong
    Wu, Han
    Wang, Zhefeng
    Yuan, Nicholas Jing
    Chen, Enhong
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 118 - 126
  • [40] GCMR-Net: A Global Context-Enhanced Multi-scale Residual Network for medical image segmentation
    Shi, Anqi
    Shu, Xin
    Xu, Dan
    Wang, Fang
    MULTIMEDIA SYSTEMS, 2025, 31 (01)