Improved Deepfake Video Detection Using Convolutional Vision Transformer

Cited by: 2
Authors
Deressa, Deressa Wodajo
Lambert, Peter [1]
Van Wallendael, Glenn [1]
Atnafu, Solomon [2]
Mareen, Hannes [1]
Affiliations
[1] Univ Ghent, IMEC, IDLab, Dept Elect & Informat Syst, Ghent, Belgium
[2] Addis Ababa Univ, Addis Ababa, Ethiopia
Keywords
Deepfake Video Detection; Vision Transformer; Convolutional Neural Network; Misinformation Detection; Multimedia Forensics;
DOI
10.1109/GEM61861.2024.10585593
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Deepfakes are hyper-realistic videos in which the faces are replaced, swapped, or forged using deep-learning models. These potent media manipulation techniques hold promise for applications across various domains. Yet, they also present a significant risk when employed for malicious purposes such as identity fraud, phishing, spreading false information, and executing scams. In this work, we propose a novel and improved Deepfake video detector that uses a Convolutional Vision Transformer (CViT2), which builds on the concepts of our previous work (CViT). The CViT architecture consists of two components: a Convolutional Neural Network that extracts learnable features, and a Vision Transformer that categorizes these learned features using an attention mechanism. We trained and evaluated our model on five datasets, namely the Deepfake Detection Challenge Dataset (DFDC), FaceForensics++ (FF++), Celeb-DF v2, DeepfakeTIMIT, and TrustedMedia. On test sets unseen during training, we achieved an accuracy of 95%, 94.8%, 98.3%, and 76.7% on the DFDC, FF++, Celeb-DF v2, and TIMIT datasets, respectively. In conclusion, our proposed Deepfake detector can be used in the battle against misinformation and in other forensic use cases.
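As a rough illustration of the two-stage idea in the abstract (convolutional feature extraction followed by attention-based classification), the sketch below runs a toy forward pass in plain Python. It is not the authors' CViT2 implementation: the single 3x3 kernel, the row-wise tokenization, the single attention head, the zero class token, the random untrained weights, and all function names (`conv2d_relu`, `self_attention`, `classify_face`) are assumptions made only for this example.

```python
import math
import random


def conv2d_relu(image, kernel):
    """Valid 2D cross-correlation followed by ReLU (stand-in for the CNN stage)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[max(0.0, sum(image[i + a][j + b] * kernel[a][b]
                          for a in range(kh) for b in range(kw)))
             for j in range(out_w)]
            for i in range(out_h)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def rand_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over token vectors."""
    qs = [matvec(wq, t) for t in tokens]
    ks = [matvec(wk, t) for t in tokens]
    vs = [matvec(wv, t) for t in tokens]
    scale = math.sqrt(len(qs[0]))
    out = []
    for q in qs:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in ks])
        out.append([sum(w * v[d] for w, v in zip(weights, vs))
                    for d in range(len(vs[0]))])
    return out


def classify_face(image, seed=0):
    """Toy pass: conv features -> row tokens -> attention -> [p_real, p_fake]."""
    rng = random.Random(seed)  # assumption: untrained, randomly initialised weights
    kernel = rand_matrix(rng, 3, 3)
    feature_map = conv2d_relu(image, kernel)  # convolutional feature extraction
    dim = len(feature_map[0])
    cls_token = [0.0] * dim  # hypothetical class token prepended to the sequence
    tokens = [cls_token] + feature_map  # each feature-map row acts as one token
    wq, wk, wv = (rand_matrix(rng, dim, dim) for _ in range(3))
    attended = self_attention(tokens, wq, wk, wv)
    head = rand_matrix(rng, 2, dim)  # linear head over the attended class token
    return softmax(matvec(head, attended[0]))


# A synthetic 8x8 grayscale "face crop"; real inputs would be detected face regions.
image = [[((r * 8 + c) % 7) / 6.0 for c in range(8)] for r in range(8)]
probs = classify_face(image)
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how features from a convolutional stage can be passed as tokens to an attention stage that produces a two-class output.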
Pages: 492-497
Number of pages: 6