Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning

Cited by: 5

Authors:

Tejankar, Ajinkya [1]

Sanjabi, Maziar [2]

Wang, Qifan [2]

Wang, Sinong [2]

Firooz, Hamed [2]

Pirsiavash, Hamed [1]

Tan, Liang [2]

Affiliations:

[1] Univ Calif Davis, Davis, CA 95616 USA

[2] Meta AI, Delaware, OH USA

Source:

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023

Keywords:

DOI:

10.1109/CVPR52729.2023.01178

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recently, self-supervised learning (SSL) was shown to be vulnerable to patch-based data poisoning backdoor attacks. It was shown that an adversary can poison a small part of the unlabeled data so that when a victim trains an SSL model on it, the final model will have a backdoor that the adversary can exploit. This work aims to defend self-supervised learning against such attacks. We use a three-step defense pipeline, where we first train a model on the poisoned data. In the second step, our proposed defense algorithm (PatchSearch) uses the trained model to search the training data for poisoned samples and removes them from the training set. In the third step, a final model is trained on the cleaned-up training set. Our results show that PatchSearch is an effective defense. As an example, it improves a model's accuracy on images containing the trigger from 38.2% to 63.7%, which is very close to the clean model's accuracy, 64.6%. Moreover, we show that PatchSearch outperforms baselines and state-of-the-art defense approaches, including those using additional clean, trusted data. Our code is available at https://github.com/UCDvision/PatchSearch
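
The three-step pipeline described in the abstract (train, search and remove, retrain) can be outlined as a minimal sketch. This is not the authors' PatchSearch implementation: the function name `three_step_defense`, the `train` and `poison_score` callables, and the `num_to_remove` parameter are all hypothetical placeholders, and the abstract does not say how PatchSearch actually ranks samples. The authors' code is at the repository linked above.

```python
from typing import Any, Callable, List, Sequence, Tuple


def three_step_defense(
    data: Sequence[Any],
    train: Callable[[Sequence[Any]], Any],
    poison_score: Callable[[Any, Any], float],
    num_to_remove: int,
) -> Tuple[Any, List[Any]]:
    """Train, filter suspicious samples, retrain (hypothetical outline)."""
    # Step 1: train an initial model on the possibly poisoned data.
    initial_model = train(data)

    # Step 2: score every sample with the initial model and drop the
    # highest-scoring ones. `poison_score` is an assumed stand-in for
    # whatever search procedure the defense uses.
    scores = [poison_score(initial_model, sample) for sample in data]
    ranked = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:num_to_remove])
    cleaned = [sample for i, sample in enumerate(data) if i not in flagged]

    # Step 3: train the final model on the cleaned-up training set.
    final_model = train(cleaned)
    return final_model, cleaned
```

As a toy usage, `train` could return the mean of a list of numbers and `poison_score` the distance of a sample from that mean, so that a single outlier is removed before the second training pass. A real defense would plug in an SSL training routine and an image-level scoring function instead.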


Pages: 12239-12249

Page count: 11

