Fine-Grained Multimodal DeepFake Classification via Heterogeneous Graphs

Cited by: 0

Authors:

Yin, Qilin ^{[1,2,3]}

Lu, Wei ^{[1,2,3]}

Cao, Xiaochun ^{[4]}

Luo, Xiangyang ^{[5]}

Zhou, Yicong ^{[6]}

Huang, Jiwu ^{[7]}

Affiliations:

[1] Sun Yat-sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China

[2] Sun Yat-sen Univ, Key Lab Informat Technol, Minist Educ, Guangzhou 510006, Peoples R China

[3] Sun Yat-sen Univ, Guangdong Prov Key Lab Informat Secur Technol, Guangzhou 510006, Peoples R China

[4] Sun Yat-sen Univ, Sch Cyber Sci & Technol, Shenzhen 518107, Peoples R China

[5] State Key Lab Math Engn & Adv Comp, Zhengzhou 450001, Henan, Peoples R China

[6] Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China

[7] Shenzhen MSU-BIT Univ, Fac Engn, Guangdong Lab Machine Percept & Intelligent Comp, Shenzhen, Peoples R China

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2024 / Vol. 132 / No. 11

Funding:

National Natural Science Foundation of China;

Keywords:

Multimodal deepfake classification; Audio-visual model; Graph neural networks; Heterogeneous graphs;

DOI:

10.1007/s11263-024-02128-1

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The abuse of deepfakes is a well-known problem because deepfakes can cause severe security and privacy harm. The situation is worsening: attackers are no longer limited to unimodal deepfakes and now use multimodal deepfakes, i.e., both audio forgery and video forgery, to achieve malicious goals more effectively. As multimodal deepfake techniques advance, existing unimodal and ensemble deepfake detectors need fine-grained classification capabilities. To address this gap, we propose a graph attention network based on heterogeneous graphs for fine-grained multimodal deepfake classification, i.e., not only determining whether a sample is authentic but also identifying the forgery type, e.g., video, audio, or both. To this end, we propose a positional coding-based heterogeneous graph construction method that converts an audio-visual sample into a multimodal heterogeneous graph according to relevant hyperparameters. Moreover, a cross-modal graph interaction module is devised to exploit audio-visual synchronization patterns and capture inter-modal complementary information. A de-homogenization graph pooling operation is designed to preserve differences among graph node features, enhancing the graph-level feature representation. Through the heterogeneous graph attention network, we can efficiently model intra- and inter-modal relationships of multimodal data at both spatial and temporal scales. Extensive experimental results on two audio-visual datasets, FakeAVCeleb and LAV-DF, demonstrate that the proposed model obtains significant performance gains over other state-of-the-art competitors. The code is available at https://github.com/yinql1995/Fine-grained-Multimodal-DeepFake-Classification/.
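
As a rough illustration of the two ideas named in the abstract (fine-grained labels and an audio-visual heterogeneous graph), the following toy sketch may help. It is not the authors' positional coding-based construction, which the abstract does not specify. The label names, the `HeteroGraph` container, the `build_av_graph` function, and the `window` hyperparameter are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical label scheme: the abstract names the forgery types
# (video, audio, or both) but not how they are encoded.
LABELS = {
    (False, False): "real",
    (True, False): "fake_video",
    (False, True): "fake_audio",
    (True, True): "fake_both",
}


def fine_grained_label(video_forged: bool, audio_forged: bool) -> str:
    """Map per-modality forgery flags to one of four classes."""
    return LABELS[(video_forged, audio_forged)]


@dataclass
class HeteroGraph:
    # Node ids are (modality, time_index) tuples.
    nodes: list = field(default_factory=list)
    # Edges are grouped by relation type: video-video, audio-audio, video-audio.
    edges: dict = field(
        default_factory=lambda: {"v-v": [], "a-a": [], "v-a": []}
    )


def build_av_graph(num_steps: int, window: int = 1) -> HeteroGraph:
    """Toy construction: one video node and one audio node per time step.

    Assumption: `window` limits how far intra-modal temporal edges reach.
    """
    g = HeteroGraph()
    for t in range(num_steps):
        g.nodes.append(("video", t))
        g.nodes.append(("audio", t))
    for t in range(num_steps):
        # Inter-modal edge between temporally aligned nodes.
        g.edges["v-a"].append((("video", t), ("audio", t)))
        # Intra-modal edges to later nodes within the temporal window.
        for d in range(1, window + 1):
            if t + d < num_steps:
                g.edges["v-v"].append((("video", t), ("video", t + d)))
                g.edges["a-a"].append((("audio", t), ("audio", t + d)))
    return g
```

In this sketch the two node types and three edge types are what make the graph heterogeneous; a graph attention network would then attend separately over each relation type.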

Pages: 5255-5269

Number of pages: 15
