Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning

Cited by: 5
Authors
Tejankar, Ajinkya [1]
Sanjabi, Maziar [2]
Wang, Qifan [2]
Wang, Sinong [2]
Firooz, Hamed [2]
Pirsiavash, Hamed [1]
Tan, Liang [2]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
[2] Meta AI, Delaware, OH USA
DOI
10.1109/CVPR52729.2023.01178
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, self-supervised learning (SSL) was shown to be vulnerable to patch-based data poisoning backdoor attacks. It was shown that an adversary can poison a small part of the unlabeled data so that when a victim trains an SSL model on it, the final model will have a backdoor that the adversary can exploit. This work aims to defend self-supervised learning against such attacks. We use a three-step defense pipeline, where we first train a model on the poisoned data. In the second step, our proposed defense algorithm (PatchSearch) uses the trained model to search the training data for poisoned samples and removes them from the training set. In the third step, a final model is trained on the cleaned-up training set. Our results show that PatchSearch is an effective defense. As an example, it improves a model's accuracy on images containing the trigger from 38.2% to 63.7%, which is very close to the clean model's accuracy of 64.6%. Moreover, we show that PatchSearch outperforms baselines and state-of-the-art defense approaches, including those using additional clean, trusted data. Our code is available at https://github.com/UCDvision/PatchSearch.
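The three-step structure described in the abstract (train, filter, retrain) can be illustrated with a minimal sketch. This is not the PatchSearch algorithm itself: the names `three_step_defense`, `train`, `poison_score`, and `threshold` are hypothetical, and the scoring rule is a placeholder supplied by the caller, since the abstract does not specify how poisoned samples are found.

```python
from typing import Callable, List, Sequence, TypeVar

Sample = TypeVar("Sample")
Model = TypeVar("Model")


def three_step_defense(
    data: Sequence[Sample],
    train: Callable[[Sequence[Sample]], Model],
    poison_score: Callable[[Model, Sample], float],
    threshold: float,
) -> Model:
    """Train, drop suspicious samples, then retrain on what remains.

    All arguments are hypothetical stand-ins: `train` is any training
    routine, and `poison_score` is any rule that uses the first model
    to rate how suspicious a sample looks (higher = more suspicious).
    """
    # Step 1: train an initial model on the possibly poisoned data.
    initial_model = train(data)

    # Step 2: use the initial model to score every training sample and
    # keep only those at or below the threshold.
    cleaned: List[Sample] = [
        x for x in data if poison_score(initial_model, x) <= threshold
    ]

    # Step 3: train the final model on the cleaned-up training set.
    return train(cleaned)
```

With a toy "model" such as the mean of a list of numbers and a score equal to the distance from that mean, calling the function on a list containing one large outlier removes the outlier before the second training pass. The only point the sketch is meant to convey is that the first model is used solely to clean the data, and the model that is finally kept never sees the removed samples.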
Pages: 12239-12249
Page count: 11