Perceptual Hashing Using Pretrained Vision Transformers

Times cited: 0
Authors
De Geest, Jelle [1]
De Smet, Patrick [2]
Bonetto, Lucio [2]
Lambert, Peter [1]
Van Wallendael, Glenn [1]
Mareen, Hannes [1]
Affiliations
[1] Univ Ghent, Imec, Dept Elect & Informat Syst, Technol Pk Zwijnaarde 122, B-9052 Ghent, Belgium
[2] Natl Inst Criminalist & Criminol NICC, Vilvoordsesteenweg 100, B-1120 Brussels, Belgium
Source
2024 IEEE GAMING, ENTERTAINMENT, AND MEDIA CONFERENCE, GEM 2024 | 2024
Keywords
Perceptual Hashing; Vision Transformer; Image Forensics;
DOI
10.1109/GEM61861.2024.10585453
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
The rapid evolution of digital image circulation has necessitated robust techniques for image identification and comparison, particularly for sensitive applications such as detecting Child Sexual Abuse Material (CSAM) and preventing the spread of harmful content online. Traditional perceptual hashing methods, while useful, fall short when exposed to some common image transformations, or when images are doctored to avoid detection, rendering them ineffective for nuanced comparisons. Addressing this challenge, this paper introduces a novel pretrained vision transformer artificial intelligence (AI) model approach that enhances the robustness and accuracy of perceptual hashing. Leveraging a pretrained Vision Transformer (ViT-L/14), our approach integrates visual and textual data processing to generate feature arrays that represent perceptual image hashes. Through a comprehensive evaluation using a dataset of 50,000 images, we demonstrate that our method offers significant improvements in detecting similarities for certain complex image transformations, aligning more closely with human visual perception than conventional methods. While our method presents certain initial drawbacks such as larger hash sizes and high computational complexity, its ability to better handle perceptual nuances presents a forward step in the realm of image forensics. The potential applications of this research extend to law enforcement, digital media management, and the broader domain of content verification, setting the stage for more secure and efficient digital content analysis.
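As a rough illustration of how feature-array hashes of this kind could be compared, the sketch below treats two hashes as plain lists of floats and scores them by cosine similarity. The abstract does not state which comparison metric or threshold the authors use, so the metric, the function names, and the 0.9 threshold are all assumptions made for illustration. The feature arrays themselves are assumed to come from a pretrained encoder such as ViT-L/14, which is not loaded here.

```python
import math
from typing import Sequence


def l2_normalize(features: Sequence[float]) -> list[float]:
    """Scale a feature array to unit length (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in features))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero feature array")
    return [x / norm for x in features]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature-array hashes of equal length."""
    if len(a) != len(b):
        raise ValueError("feature arrays must have the same length")
    unit_a, unit_b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(unit_a, unit_b))


def is_perceptual_match(
    a: Sequence[float], b: Sequence[float], threshold: float = 0.9
) -> bool:
    """Declare a match when similarity reaches the threshold.

    The default threshold is an assumed placeholder, not a value from the paper.
    """
    return cosine_similarity(a, b) >= threshold


if __name__ == "__main__":
    # Toy vectors standing in for encoder outputs of an image and an edited copy.
    original = [0.2, 0.8, 0.1, 0.5]
    edited = [0.25, 0.75, 0.1, 0.55]
    print(is_perceptual_match(original, edited))
```

Unlike a short binary hash compared by Hamming distance, a floating-point feature array of this kind is larger to store and costlier to compare, which is consistent with the hash-size and complexity drawbacks the abstract mentions.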
Pages: 19-24
Number of pages: 6
Related papers
  • [1] Monocular Robot Navigation with Self-Supervised Pretrained Vision Transformers
    Saavedra-Ruiz, Miguel
    Morin, Sacha
    Paull, Liam
    2022 19TH CONFERENCE ON ROBOTS AND VISION (CRV 2022), 2022, : 197 - 204
  • [2] Generating bug-fixes using pretrained transformers
    Drain, Dawn
    Wu, Chen
    Svyatkovskiy, Alexey
    Sundaresan, Neel
    MAPS 2021 - Proceedings of the 5th ACM SIGPLAN International Symposium on Machine Programming, co-located with PLDI 2021, 2021, : 1 - 8
  • [3] ACTION ITEM DETECTION IN MEETINGS USING PRETRAINED TRANSFORMERS
    Sachdeva, Kishan
    Maynez, Joshua
    Siohan, Olivier
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 861 - 868
  • [4] Finetuning Pretrained Transformers into RNNs
    Kasai, Jungo
    Peng, Hao
    Zhang, Yizhe
    Yogatama, Dani
    Ilharco, Gabriel
    Pappas, Nikolaos
    Mao, Yi
    Chen, Weizhu
    Smith, Noah A.
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 10630 - 10643
  • [5] Knowledge Neurons in Pretrained Transformers
    Dai, Damai
    Dong, Li
    Hao, Yaru
    Sui, Zhifang
    Chang, Baobao
    Wei, Furu
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 8493 - 8502
  • [6] Robust Visual Tracking Based on Improved Perceptual Hashing for Robot Vision
    Fei, Mengjuan
    Li, Jing
    Shao, Ling
    Ju, Zhaojie
    Ouyang, Gaoxiang
    INTELLIGENT ROBOTICS AND APPLICATIONS (ICIRA 2015), PT III, 2015, 9246 : 331 - 340
  • [7] Audio content identification by using perceptual hashing
    Lancini, R
    Mapelli, F
    Pezzano, R
    2004 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1-3, 2004, : 739 - 742
  • [8] Secure Perceptual Hashing of Data using Encryption
    Sahana, M. S.
    2017 INTERNATIONAL CONFERENCE ON CURRENT TRENDS IN COMPUTER, ELECTRICAL, ELECTRONICS AND COMMUNICATION (CTCEEC), 2017, : 524 - 528
  • [9] WALKING DIRECTION IDENTIFICATION USING PERCEPTUAL HASHING
    Verlekar, Tanmay T.
    Correia, Paulo L.
    2016 4TH INTERNATIONAL WORKSHOP ON BIOMETRICS AND FORENSICS (IWBF), 2016,
  • [10] Recasting Generic Pretrained Vision Transformers As Object-Centric Scene Encoders For Manipulation Policies
    Qian, Jianing
    Panagopoulos, Anastasios
    Jayaraman, Dinesh
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024, : 17544 - 17552