ECENet: Explainable and Context-Enhanced Network for Multi-modal Fact Verification

Cited by: 4
Authors
Zhang, Fanrui [1 ]
Liu, Jiawei [1 ]
Zhang, Qiang [1 ]
Sun, Esther [2 ]
Xie, Jingyi [1 ]
Zha, Zheng-Jun [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Univ Toronto, Toronto, ON, Canada
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Multi-modal fact verification; Attention mechanism; Deep reinforcement learning; Interpretability;
DOI
10.1145/3581783.3612183
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Recently, falsified claims incorporating both text and images have been disseminated more effectively than those containing text alone, raising significant concerns for multi-modal fact verification. Existing research contributes to multi-modal feature extraction and interaction, but fails to fully exploit and enhance the valuable and intricate semantic relationships between distinct features. Moreover, most detectors merely provide a single outcome judgment and lack an inference process or explanation. Taking these factors into account, we propose a novel Explainable and Context-Enhanced Network (ECENet) for multi-modal fact verification, making the first attempt to integrate multi-clue feature extraction, multi-level feature reasoning, and justification (explanation) generation within a unified framework. Specifically, we propose an Improved Coarse- and Fine-grained Attention Network, equipped with two types of level-grained attention mechanisms, to facilitate a comprehensive understanding of contextual information. Furthermore, we propose a novel justification generation module via deep reinforcement learning that does not require additional labels. In this module, a sentence extractor agent measures the importance between the query claim and all document sentences at each time step, selecting a suitable number of high-scoring sentences to be rewritten as the explanation of the model. Extensive experiments demonstrate the effectiveness of the proposed method.
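To make the justification module more concrete, the following is a minimal sketch of a REINFORCE-style sentence-extractor agent that, at each time step, scores document sentences against the query claim and samples one to add to the evidence used for the generated explanation. It assumes pre-computed claim and sentence embeddings; the class name, scoring function, and dimensions are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the sentence-selection step described in the abstract.
# Assumes pre-computed embeddings; not the paper's exact architecture.
import torch
import torch.nn as nn


class SentenceExtractorAgent(nn.Module):
    """At each time step, scores claim-sentence relevance and samples one
    not-yet-selected sentence, accumulating log-probabilities for a
    policy-gradient (REINFORCE) update."""

    def __init__(self, dim: int = 256):
        super().__init__()
        self.score = nn.Bilinear(dim, dim, 1)  # bilinear claim-sentence relevance

    def forward(self, claim: torch.Tensor, sentences: torch.Tensor, k: int = 3):
        # claim: (dim,), sentences: (num_sent, dim)
        selected, log_probs = [], []
        picked = torch.zeros(sentences.size(0), dtype=torch.bool)
        for _ in range(min(k, sentences.size(0))):
            scores = self.score(claim.expand_as(sentences), sentences).squeeze(-1)
            scores = scores.masked_fill(picked, float("-inf"))  # exclude chosen ones
            dist = torch.distributions.Categorical(logits=scores)
            idx = dist.sample()
            selected.append(idx.item())
            log_probs.append(dist.log_prob(idx))
            picked[idx] = True
        return selected, torch.stack(log_probs).sum()


# Usage sketch: the reward would come from the quality of the rewritten
# explanation (e.g., its consistency with the verification label), so no
# extra sentence-level labels are needed.
agent = SentenceExtractorAgent(dim=256)
claim_emb = torch.randn(256)
sent_embs = torch.randn(12, 256)
chosen, logp = agent(claim_emb, sent_embs, k=3)
reward = torch.tensor(1.0)      # placeholder reward signal
loss = -reward * logp           # REINFORCE objective
loss.backward()
```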
Pages: 1231 - 1240
Page count: 10