Adversarial Feature Desensitization

Cited: 0
Authors
Bashivan, Pouya [1,2]
Bayat, Reza [2]
Ibrahim, Adam [2]
Ahuja, Kartik [2]
Faramarzi, Mojtaba [2]
Laleh, Touraj [2]
Richards, Blake [1,2]
Rish, Irina [2]
Affiliations
[1] McGill Univ, Montreal, PQ, Canada
[2] Univ Montreal, MILA, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
None listed
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Neural networks are known to be vulnerable to adversarial attacks: slight but carefully constructed perturbations of the inputs which can drastically impair the network's performance. Many defense methods have been proposed for improving robustness of deep networks by training them on adversarially perturbed inputs. However, these models often remain vulnerable to new types of attacks not seen during training, and even to slightly stronger versions of previously seen attacks. In this work, we propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field. Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant to adversarial perturbations of the inputs. This is achieved through a game where we learn features that are both predictive and robust (insensitive to adversarial attacks), i.e., features that cannot be used to discriminate between natural and adversarial data. Empirical results on several benchmarks demonstrate the effectiveness of the proposed approach against a wide range of attack types and attack strengths. Our code is available at https://github.com/BashivanLab/afd.
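The game described in the abstract can be sketched in PyTorch: a feature extractor feeds both a task classifier and a domain discriminator that tries to tell natural features from adversarial ones, while a gradient-reversal layer pushes the extractor to make the two indistinguishable. This is a minimal illustration only, assuming a gradient-reversal formulation and FGSM as the inner attack; the layer sizes, module names, and single combined loss are placeholders, not the paper's actual architecture or training procedure (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates the gradient on the backward
    pass, so the feature extractor is trained to *fool* the discriminator."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


def fgsm(model_fn, x, y, eps):
    """One-step FGSM attack: perturb x in the direction of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model_fn(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()


torch.manual_seed(0)

# Toy networks (illustrative sizes, not the paper's models).
feat = nn.Sequential(nn.Linear(10, 16), nn.ReLU())  # feature extractor
clf = nn.Linear(16, 3)                              # task classifier
disc = nn.Linear(16, 2)                             # natural-vs-adversarial discriminator

x = torch.randn(8, 10)
y = torch.randint(0, 3, (8,))

# Craft adversarial inputs against the current feature extractor + classifier.
x_adv = fgsm(lambda z: clf(feat(z)), x, y, eps=0.1)

z_nat, z_adv = feat(x), feat(x_adv)

# Task objective: classify both natural and adversarial examples correctly.
task_loss = F.cross_entropy(clf(z_nat), y) + F.cross_entropy(clf(z_adv), y)

# Domain objective: discriminator labels natural features 0 and adversarial
# features 1; gradient reversal makes `feat` minimize their separability.
dom_in = GradReverse.apply(torch.cat([z_nat, z_adv]))
dom_y = torch.cat([torch.zeros(8), torch.ones(8)]).long()
dom_loss = F.cross_entropy(disc(dom_in), dom_y)

total = task_loss + dom_loss
total.backward()
```

In practice the classifier/extractor and the discriminator would be updated with separate optimizers in alternating steps, as is standard for adversarial (GAN-style) objectives; the single combined backward pass above is only to keep the sketch short.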
Pages: 13
Related Papers
50 records in total
  • [1] Robustness-enhanced Uplift Modeling with Adversarial Feature Desensitization
    Sun, Zexu
    He, Bowei
    Ma, Ming
    Tang, Jiakai
    Wang, Yuchen
    Ma, Chen
    Liu, Dugang
    23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, ICDM 2023, 2023, : 1325 - 1330
  • [2] Adversarial Feature Selection
    Budhraja, Karan K.
    Oates, Tim
    2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2015, : 288 - 294
  • [3] Visual Feature Attribution Based on Adversarial Feature Pairs
    Zhang, X.
    Shi, C.
    Li, X.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2020, 57 (03): : 604 - 615
  • [4] Adversarial anchor-guided feature refinement for adversarial defense
    Lee, Hakmin
    Ro, Yong Man
    IMAGE AND VISION COMPUTING, 2023, 136
  • [5] Domain Generalization with Adversarial Feature Learning
    Li, Haoliang
    Pan, Sinno Jialin
    Wang, Shiqi
    Kot, Alex C.
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 5400 - 5409
  • [6] Adversarial Training Based Feature Selection
    Liu, Binghui
    Han, Keji
    Hang, Jie
    Li, Yun
    SCIENCE OF CYBER SECURITY, SCISEC 2019, 2019, 11933 : 92 - 105
  • [7] Feature autoencoder for detecting adversarial examples
    Ye, Hongwei
    Liu, Xiaozhang
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2022, 37 (10) : 7459 - 7477
  • [8] Sparse Feature Attacks in Adversarial Learning
    Yin, Zhizhou
    Wang, Fei
    Liu, Wei
    Chawla, Sanjay
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2018, 30 (06) : 1164 - 1177
  • [9] Adversarial Feature Matching for Text Generation
    Zhang, Yizhe
    Gan, Zhe
    Fan, Kai
    Chen, Zhi
    Henao, Ricardo
    Shen, Dinghan
    Carin, Lawrence
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [10] Feature Separation and Recalibration for Adversarial Robustness
    Kim, Woo Jae
    Cho, Yoonki
    Jung, Junsik
    Yoon, Sung-Eui
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 8183 - 8192