Revisiting Adversarial Robustness Distillation from the Perspective of Robust Fairness

Cited: 0
Authors
Yue, Xinli [1 ]
Mou, Ningping [1 ]
Wang, Qian [1 ]
Zhao, Lingchen [1 ]
Affiliation
[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan 430072, Peoples R China
Keywords
DOI
N/A
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Adversarial Robustness Distillation (ARD) aims to transfer the robustness of large teacher models to small student models, facilitating the attainment of robust performance on resource-limited devices. However, existing research on ARD primarily focuses on the overall robustness of student models, overlooking the crucial aspect of robust fairness. Specifically, these models may demonstrate strong robustness on some classes of data while exhibiting high vulnerability on other classes. Unfortunately, the "buckets effect" implies that the robustness of the deployed model depends on the classes with the lowest level of robustness. In this paper, we first investigate the inheritance of robust fairness during ARD and reveal that student models only partially inherit robust fairness from teacher models. We further validate this issue through fine-grained experiments with various model capacities and find that it may arise due to the gap in capacity between teacher and student models, as well as the existing methods treating each class equally during distillation. Based on these observations, we propose Fair Adversarial Robustness Distillation (Fair-ARD), a novel framework for enhancing the robust fairness of student models by increasing the weights of difficult classes, and design a geometric perspective-based method to quantify the difficulty of different classes for determining the weights. Extensive experiments show that Fair-ARD surpasses both state-of-the-art ARD methods and existing robust fairness algorithms in terms of robust fairness (e.g., the worst-class robustness under AutoAttack is improved by up to 12.3% and 5.3% using ResNet18 on CIFAR10, respectively), while also slightly improving overall robustness. Our code is available at: https://github.com/NISP-official/Fair-ARD.
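The abstract's core mechanism, upweighting difficult classes during distillation, can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's actual formulation: the function names, the use of per-class robust accuracy as the difficulty proxy, and the power-law weighting are all assumptions made here for illustration (the paper instead quantifies class difficulty from a geometric perspective).

```python
def class_weights(per_class_robust_acc, p=1.0):
    """Assign larger weights to classes with lower robust accuracy.

    per_class_robust_acc: robust accuracy of each class, each in [0, 1].
    p: exponent controlling how sharply hard classes are upweighted.
    Weights are normalized to mean 1, so the overall loss scale is
    preserved while harder classes contribute more.
    """
    difficulty = [(1.0 - acc) ** p for acc in per_class_robust_acc]
    mean_d = sum(difficulty) / len(difficulty)
    return [d / mean_d for d in difficulty]

def weighted_distillation_loss(per_example_loss, labels, weights):
    """Reweight a per-example distillation loss (e.g., the KL divergence
    between student and teacher outputs) by each example's class weight."""
    terms = [weights[y] * loss for loss, y in zip(per_example_loss, labels)]
    return sum(terms) / len(terms)
```

Under this sketch, a class with 50% robust accuracy receives a larger weight than one with 90%, so the distillation objective pushes the student harder on its most vulnerable classes, which is the lever Fair-ARD uses to raise worst-class robustness.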
Pages: 12