SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics

Cited by: 0
Authors
Hayase, Jonathan [1 ]
Kong, Weihao [1 ]
Somani, Raghav [1 ]
Oh, Sewoong [1 ]
Affiliation
[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Modern machine learning increasingly requires training on large collections of data from multiple sources, not all of which can be trusted. A particularly concerning scenario is when a small fraction of poisoned data changes the behavior of the trained model when triggered by an attacker-specified watermark. Such a compromised model can be deployed unnoticed, since it remains accurate on clean inputs. There have been promising attempts to use the intermediate representations of such a model to separate corrupted examples from clean ones. However, these defenses work only when a certain spectral signature of the poisoned examples is large enough for detection, and a wide range of attacks cannot be defended against by existing methods. We propose a novel defense algorithm that uses robust covariance estimation to amplify the spectral signature of corrupted data. This defense yields a clean model, completely removing the backdoor, even in regimes where previous methods have no hope of detecting the poisoned examples.
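To make the high-level idea concrete, the following is a minimal, illustrative sketch rather than the authors' SPECTRE implementation: it stands in a crude iterative-trimming estimator for the paper's robust covariance estimation and a plain top-singular-direction projection for its outlier score, and names such as filter_suspected_poison are hypothetical.

    import numpy as np

    def trimmed_mean_cov(X, trim_frac=0.05, iters=5):
        """Crude robust mean/covariance stand-in: repeatedly drop the points
        with the largest Mahalanobis distance and re-estimate."""
        keep = np.ones(len(X), dtype=bool)
        for _ in range(iters):
            mu = X[keep].mean(axis=0)
            cov = np.cov(X[keep], rowvar=False) + 1e-6 * np.eye(X.shape[1])
            d = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(cov), X - mu)
            cutoff = np.quantile(d[keep], 1.0 - trim_frac)
            keep = d <= cutoff
        return mu, cov

    def filter_suspected_poison(reps, remove_frac=0.1):
        """Whiten the representations of one target class with the robust
        covariance estimate, score each example by its projection onto the
        top singular direction, and flag the highest-scoring fraction."""
        mu, cov = trimmed_mean_cov(reps)
        # Whitening with a clean-dominated covariance amplifies the directions
        # in which the poisoned examples deviate (their "spectral signature").
        evals, evecs = np.linalg.eigh(cov)
        whiten = evecs @ np.diag(evals ** -0.5) @ evecs.T
        Z = (reps - mu) @ whiten
        _, _, vt = np.linalg.svd(Z, full_matrices=False)
        scores = np.abs(Z @ vt[0])
        n_remove = int(remove_frac * len(reps))
        suspected = np.argsort(scores)[::-1][:n_remove]
        return suspected  # indices to drop before retraining

In the paper itself, the covariance is estimated with a robust estimator with guarantees and the outlier score is a quantum-entropy (QUE) based score rather than a single projection; the sketch only conveys why whitening before spectral filtering can expose poison that direct spectral methods miss.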
Pages: 11
Related Papers
50 records in total (first 10 listed below)
  • [1] An adaptive robust defending algorithm against backdoor attacks in federated learning
    Wang, Yongkang
    Zhai, Di-Hua
    He, Yongping
    Xia, Yuanqing
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 143 : 118 - 131
  • [2] Defending Against Backdoor Attacks by Quarantine Training
    Yu, Chengxu
    Zhang, Yulai
    IEEE ACCESS, 2024, 12 : 10681 - 10689
  • [3] Defending against Backdoor Attacks in Natural Language Generation
    Sun, Xiaofei
    Li, Xiaoya
    Meng, Yuxian
    Ao, Xiang
    Lyu, Lingjuan
    Li, Jiwei
    Zhang, Tianwei
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 4, 2023, : 5257 - 5265
  • [4] Invariant Aggregator for Defending against Federated Backdoor Attacks
    Wang, Xiaoyang
    Dimitriadis, Dimitrios
    Koyejo, Sanmi
    Tople, Shruti
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [5] FedPD: Defending federated prototype learning against backdoor attacks
    Tan, Zhou
    Cai, Jianping
    Li, De
    Lian, Puwei
    Liu, Ximeng
    Che, Yan
    NEURAL NETWORKS, 2025, 184
  • [6] RoPE: Defending against backdoor attacks in federated learning systems
    Wang, Yongkang
    Zhai, Di-Hua
    Xia, Yuanqing
    KNOWLEDGE-BASED SYSTEMS, 2024, 293
  • [7] DEFENDING AGAINST BACKDOOR ATTACKS IN FEDERATED LEARNING WITH DIFFERENTIAL PRIVACY
    Miao, Lu
    Yang, Wei
    Hu, Rong
    Li, Lu
    Huang, Liusheng
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2999 - 3003
  • [8] Artemis: Defending Against Backdoor Attacks via Distribution Shift
    Xue, Meng
    Wang, Zhixian
    Zhang, Qian
    Gong, Xueluan
    Liu, Zhihang
    Chen, Yanjiao
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2025, 22 (02) : 1781 - 1795
  • [9] Defending Against Data and Model Backdoor Attacks in Federated Learning
    Wang, Hao
    Mu, Xuejiao
    Wang, Dong
    Xu, Qiang
    Li, Kaiju
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (24): 39276 - 39294
  • [10] I Know Your Triggers: Defending Against Textual Backdoor Attacks with Benign Backdoor Augmentation
    Gao, Yue
    Stokes, Jack W.
    Prasad, Manoj Ajith
    Marshall, Andrew T.
    Fawaz, Kassem
    Kiciman, Emre
    2022 IEEE MILITARY COMMUNICATIONS CONFERENCE (MILCOM), 2022