SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics

Cited by: 0
Authors
Hayase, Jonathan [1 ]
Kong, Weihao [1 ]
Somani, Raghav [1 ]
Oh, Sewoong [1 ]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern machine learning increasingly requires training on large collections of data from multiple sources, not all of which can be trusted. A particularly concerning scenario is when a small fraction of poisoned data changes the behavior of the trained model whenever an attacker-specified watermark appears in the input. Such a compromised model can be deployed unnoticed, since it remains accurate on clean inputs. There have been promising attempts to use the intermediate representations of such a model to separate corrupted examples from clean ones. However, these defenses work only when a certain spectral signature of the poisoned examples is large enough for detection, leaving a wide range of attacks against which existing defenses offer no protection. We propose a novel defense algorithm that uses robust covariance estimation to amplify the spectral signature of corrupted data. This defense yields a clean model, completely removing the backdoor, even in regimes where previous methods have no hope of detecting the poisoned examples.
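The core idea the abstract describes can be illustrated with a small numerical sketch: when clean representations have a large benign variance direction, the top singular vector of the raw data misses the poisoned cluster, but whitening by an estimate of the clean covariance amplifies the poison's spectral signature. This is not the paper's actual SPECTRE algorithm (which estimates the covariance robustly from the possibly-poisoned data itself); purely for illustration, the covariance here is estimated from the clean subset, and all data, shapes, and thresholds are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic intermediate representations: 950 clean + 50 poisoned examples.
# Clean features have one large benign variance direction (axis 0) that
# masks the poison; poisoned examples share a mean shift along axis 1.
d = 16
scale = np.ones(d)
scale[0] = 5.0                                 # benign high-variance direction
clean = rng.normal(size=(950, d)) * scale
shift = np.zeros(d)
shift[1] = 5.0                                 # hidden watermark direction
X = np.vstack([clean, rng.normal(size=(50, d)) + shift])
is_poison = np.array([False] * 950 + [True] * 50)

def top_direction_scores(Z):
    """Score each row by |projection| onto the top right-singular vector."""
    Zc = Z - Z.mean(axis=0)
    _, _, vt = np.linalg.svd(Zc, full_matrices=False)
    return np.abs(Zc @ vt[0])

def recall(scores, budget=60):
    """Fraction of poisoned examples among the top-`budget` suspects."""
    flagged = np.argsort(scores)[-budget:]
    return is_poison[flagged].sum() / is_poison.sum()

# 1) Plain spectral signature: the benign high-variance direction dominates
#    the top singular vector, so the poisoned cluster goes undetected.
recall_naive = recall(top_direction_scores(X))

# 2) Whiten with the (here: oracle clean) covariance, then re-score. After
#    whitening, clean variance is near-isotropic and the poisoned cluster's
#    shift dominates the top singular direction.
cov = np.cov(clean, rowvar=False)
evals, evecs = np.linalg.eigh(cov)
W = evecs @ np.diag(evals ** -0.5) @ evecs.T   # inverse matrix square root
recall_white = recall(top_direction_scores((X - clean.mean(axis=0)) @ W))

print(f"recall without whitening: {recall_naive:.0%}")
print(f"recall with whitening:    {recall_white:.0%}")
```

With this setup, the unwhitened score flags almost no poisoned examples, while the whitened score catches nearly all of them; the paper's contribution is recovering such a whitening transform robustly, without access to a trusted clean subset.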
Pages: 11