SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics

Cited by: 0
Authors
Hayase, Jonathan [1 ]
Kong, Weihao [1 ]
Somani, Raghav [1 ]
Oh, Sewoong [1 ]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
DOI
Not available
CLC classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern machine learning increasingly requires training on a large collection of data from multiple sources, not all of which can be trusted. A particularly concerning scenario is when a small fraction of poisoned data changes the behavior of the trained model when triggered by an attacker-specified watermark. Such a compromised model will be deployed unnoticed as the model is accurate otherwise. There have been promising attempts to use the intermediate representations of such a model to separate corrupted examples from clean ones. However, these defenses work only when a certain spectral signature of the poisoned examples is large enough for detection. There is a wide range of attacks that cannot be protected against by the existing defenses. We propose a novel defense algorithm using robust covariance estimation to amplify the spectral signature of corrupted data. This defense provides a clean model, completely removing the backdoor, even in regimes where previous methods have no hope of detecting the poisoned examples.
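The abstract describes the core idea of amplifying the spectral signature of poisoned examples by whitening intermediate representations with a robustly estimated covariance, then flagging examples with large projections onto the top singular direction. The following is a minimal illustrative sketch of that general pipeline, not the paper's actual SPECTRE algorithm: the `spectral_scores` function and its `trim_frac` parameter are hypothetical, and naive norm-based trimming stands in for proper robust covariance estimation.

```python
import numpy as np

def spectral_scores(reps, trim_frac=0.05):
    """Score examples by magnitude along the top singular direction
    of (approximately) whitened representations.

    Sketch of spectral-signature-style filtering: poisoned examples
    tend to project strongly onto the top singular vector, and
    whitening with a covariance estimated on trimmed (mostly clean)
    data amplifies that signature. Naive trimming is used here in
    place of a genuine robust covariance estimator.
    """
    X = reps - reps.mean(axis=0)                       # center
    # crude robust-ish step: drop the largest-norm points before
    # estimating covariance, so outliers do not inflate it
    norms = np.linalg.norm(X, axis=1)
    keep = norms <= np.quantile(norms, 1 - trim_frac)
    cov = np.cov(X[keep], rowvar=False)
    # whiten with the trimmed covariance
    evals, evecs = np.linalg.eigh(cov)
    W = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-8))) @ evecs.T
    Xw = X @ W
    # outlier score: |projection| onto top singular direction
    _, _, Vt = np.linalg.svd(Xw, full_matrices=False)
    return np.abs(Xw @ Vt[0])

# toy usage: a clean cluster plus a small shifted "poisoned" cluster
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 10))
poison = rng.normal(size=(10, 10)) + np.r_[5.0, np.zeros(9)]
scores = spectral_scores(np.vstack([clean, poison]))
suspects = np.argsort(scores)[-10:]   # highest-scoring examples
```

In this toy setting, the ten highest-scoring examples are (almost all) the shifted "poisoned" points at indices 200-209, since their common shift dominates the top singular direction of the whitened data.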
Pages: 11
Related papers (50 total)
  • [41] Automated Segmentation to Make Hidden Trigger Backdoor Attacks Robust against Deep Neural Networks
    Ali, Saqib
    Ashraf, Sana
    Yousaf, Muhammad Sohaib
    Riaz, Shazia
    Wang, Guojun
    APPLIED SCIENCES-BASEL, 2023, 13 (07):
  • [42] DETECTING BACKDOOR ATTACKS AGAINST POINT CLOUD CLASSIFIERS
    Xiang, Zhen
    Miller, David J.
    Chen, Siheng
    Li, Xi
    Kesidis, George
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3159 - 3163
  • [43] Defending Deep Neural Networks Against Backdoor Attack by Using De-Trigger Autoencoder
    Kwon, Hyun
    IEEE ACCESS, 2025, 13 : 11159 - 11169
  • [44] A defense method against backdoor attacks on neural networks
    Kaviani, Sara
    Shamshiri, Samaneh
    Sohn, Insoo
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 213
  • [45] ShieldNets: Defending Against Adversarial Attacks Using Probabilistic Adversarial Robustness
    Theagarajan, Rajkumar
    Chen, Ming
    Bhanu, Bir
    Zhang, Jing
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 6981 - 6989
  • [46] Countermeasures Against Backdoor Attacks Towards Malware Detectors
    Narisada, Shintaro
    Matsumoto, Yuki
    Hidano, Seira
    Uchibayashi, Toshihiro
    Suganuma, Takuo
    Hiji, Masahiro
    Kiyomoto, Shinsaku
    CRYPTOLOGY AND NETWORK SECURITY, CANS 2021, 2021, 13099 : 295 - 314
  • [47] Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks
    Yang, Wenhan
    Gao, Jingdong
    Mirzasoleiman, Baharan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [48] SATYA: Defending Against Adversarial Attacks Using Statistical Hypothesis Testing
    Raj, Sunny
    Pullum, Laura
    Ramanathan, Arvind
    Jha, Sumit Kumar
    FOUNDATIONS AND PRACTICE OF SECURITY (FPS 2017), 2018, 10723 : 277 - 292
  • [49] Defending Adversarial Attacks Against ASV Systems Using Spectral Masking
    Sreekanth, Sankala
    Murty, Kodukula Sri Rama
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2024, 43 (7) : 4487 - 4507
  • [50] Detecting and Defending against Worm Attacks Using Bot-honeynet
    Yao, Yu
    Lv, Jun-wei
    Gao, Fu-xiang
    Yu, Ge
    Deng, Qing-xu
    PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON ELECTRONIC COMMERCE AND SECURITY, VOL I, 2009, : 260 - 264