Improved Deepfake Video Detection Using Convolutional Vision Transformer

Cited by: 2

Authors:

Deressa, Deressa Wodajo

Lambert, Peter [1]

Van Wallendael, Glenn [1]

Atnafu, Solomon [2]

Mareen, Hannes [1]

Affiliations:

[1] Univ Ghent, IMEC, IDLab, Dept Elect & Informat Syst, Ghent, Belgium

[2] Addis Ababa Univ, Addis Ababa, Ethiopia

Source:

2024 IEEE GAMING, ENTERTAINMENT, AND MEDIA CONFERENCE, GEM 2024 | 2024

Keywords:

Deepfake Video Detection; Vision Transformer; Convolutional Neural Network; Misinformation Detection; Multimedia Forensics;

DOI:

10.1109/GEM61861.2024.10585593

Chinese Library Classification (CLC) number:

TP39 [Computer Applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Deepfakes are hyper-realistic videos in which the faces are replaced, swapped, or forged using deep-learning models. These potent media manipulation techniques hold promise for applications across various domains. Yet, they also present a significant risk when employed for malicious purposes such as identity fraud, phishing, spreading false information, and executing scams. In this work, we propose a novel and improved Deepfake video detector that uses a Convolutional Vision Transformer (CViT2), which builds on the concepts of our previous work (CViT). The CViT architecture consists of two components: a Convolutional Neural Network that extracts learnable features, and a Vision Transformer that categorizes these learned features using an attention mechanism. We trained and evaluated our model on five datasets, namely Deepfake Detection Challenge Dataset (DFDC), FaceForensics++ (FF++), Celeb-DF v2, DeepfakeTIMIT, and TrustedMedia. On the test sets unseen during training, we achieved an accuracy of 95%, 94.8%, 98.3% and 76.7% on the DFDC, FF++, Celeb-DF v2, and TIMIT datasets, respectively. In conclusion, our proposed Deepfake detector can be used in the battle against misinformation and other forensic use cases.
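Illustrative note: the abstract states that the Transformer component classifies CNN features "using an attention mechanism" but gives no layer details. The sketch below is not the authors' CViT2 code; it only shows generic single-head scaled dot-product self-attention over a handful of feature vectors, in plain Python. The function names, the identity query/key/value projections, and the example numbers are all assumptions made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    `tokens` is a list of equal-length feature vectors, standing in for
    a flattened CNN feature map. Queries, keys, and values are the tokens
    themselves (identity projections): an illustrative assumption, since
    a real Transformer learns these projections.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(wj * v[i] for wj, v in zip(w, tokens)) for i in range(d)])
    return outputs, weights

# Hypothetical 3-token, 4-dimensional feature map (made-up numbers).
features = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
attended, attn = self_attention(features)
```

Each row of `attn` sums to 1, so every output token is a convex combination of the input features; in a full model a classification head would then map the attended features to a real/fake score.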


Pages: 492-497

Page count: 6
