Robust shortcut and disordered robustness: Improving adversarial training through adaptive smoothing

Cited by: 0
Authors
Li, Lin [1 ]
Spratling, Michael [1 ,2 ]
Affiliations
[1] Kings Coll London, Dept Informat, London WC2B 4BG, England
[2] Univ Luxembourg, Dept Behav & Cognit Sci, L-4366 Esch Belval, Luxembourg
Keywords
Adversarial robustness; Adversarial training; Loss smoothing; Instance adaptive
DOI
10.1016/j.patcog.2025.111474
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks are highly susceptible to adversarial perturbations: artificial noise that corrupts input data in ways imperceptible to humans but causes incorrect predictions. Among the various defenses against these attacks, adversarial training has emerged as the most effective. In this work, we aim to enhance adversarial training to improve robustness against adversarial attacks. We begin by analyzing how adversarial vulnerability evolves during training from an instance-wise perspective. This analysis reveals two previously unrecognized phenomena: robust shortcut and disordered robustness. We then demonstrate that these phenomena are related to robust overfitting, a well-known issue in adversarial training. Building on these insights, we propose a novel adversarial training method: Instance-adaptive Smoothness Enhanced Adversarial Training (ISEAT). This method jointly smooths the input and weight loss landscapes in an instance-adaptive manner, preventing the exploitation of robust shortcut and thereby mitigating robust overfitting. Extensive experiments demonstrate the efficacy of ISEAT and its superiority over existing adversarial training methods. Code is available at https://github.com/TreeLLi/ISEAT.
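The abstract's core recipe — train on worst-case inputs while also flattening the weight loss landscape, with a per-instance perturbation budget — can be illustrated with a toy sketch. This is an assumption-laden stand-in, not the authors' ISEAT implementation (see the linked repository for that): it uses single-step FGSM for the inner maximization, an AWP/SAM-flavored weight ascent step for weight-landscape smoothing, and a made-up confidence-based schedule for the per-instance epsilon, all on a NumPy logistic-regression model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad_w(w, X, y):
    """Gradient of the mean logistic loss with respect to the weights w."""
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)

def fgsm(w, X, y, eps):
    """Input-landscape smoothing: train on worst-case inputs inside an
    L-inf ball of per-instance radius eps (single-step FGSM stand-in
    for the inner maximization of adversarial training)."""
    p = sigmoid(X @ w)
    grad_x = np.outer(p - y, w)          # d(loss_i)/d(x_i) for each instance
    return X + eps[:, None] * np.sign(grad_x)

# Synthetic two-class data: two Gaussian blobs in 2-D.
X = np.vstack([rng.normal(-1.5, 1.0, (100, 2)), rng.normal(1.5, 1.0, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])
w = np.zeros(2)

for step in range(200):
    # Instance-adaptive budget (one plausible schedule, not the paper's):
    # confidently classified instances get a larger eps, uncertain ones smaller.
    margin = np.abs(sigmoid(X @ w) - 0.5)
    eps = 0.1 + 0.4 * margin
    X_adv = fgsm(w, X, y, eps)
    # Weight-landscape smoothing: evaluate the descent gradient at an
    # adversarially perturbed weight point (AWP/SAM-style ascent step).
    g = loss_grad_w(w, X_adv, y)
    w_pert = w + 0.05 * g / (np.linalg.norm(g) + 1e-12)
    w -= 0.5 * loss_grad_w(w_pert, X_adv, y)

clean_acc = np.mean((sigmoid(X @ w) > 0.5) == y)
print(f"clean accuracy after adversarial training: {clean_acc:.2f}")
```

The two smoothing mechanisms are independent knobs: dropping the `w_pert` detour recovers plain FGSM training, and setting `eps` to a constant removes the instance-adaptive behavior, which makes the sketch convenient for ablating either idea in isolation.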
Pages: 11
Related Papers
50 records in total
  • [1] Weighted Adaptive Perturbations Adversarial Training for Improving Robustness
    Wang, Yan
    Zhang, Dongmei
    Zhang, Haiyang
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT II, 2022, 13630 : 402 - 415
  • [2] Improving the Robustness of the Bug Triage Model through Adversarial Training
    Kim, Min-ha
    Wang, Dae-sung
    Wang, Sheng-tsai
    Park, Seo-Hyeon
    Lee, Chan-gun
    36TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING (ICOIN 2022), 2022, : 478 - 481
  • [3] Sliced Wasserstein adversarial training for improving adversarial robustness
    Lee, W.
    Lee, S.
    Kim, H.
    Lee, J.
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2024, 15 (08) : 3229 - 3242
  • [4] Improving the robustness and accuracy of biomedical language models through adversarial training
    Moradi, Milad
    Samwald, Matthias
    JOURNAL OF BIOMEDICAL INFORMATICS, 2022, 132
  • [5] Exploring Robust Features for Improving Adversarial Robustness
    Wang, Hong
    Deng, Yuefan
    Yoo, Shinjae
    Lin, Yuewei
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (09) : 5141 - 5151
  • [6] Robust Proxy: Improving Adversarial Robustness by Robust Proxy Learning
    Lee, Hong Joo
    Ro, Yong Man
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2023, 18 : 4021 - 4033
  • [7] Enhancing Adversarial Robustness through Stable Adversarial Training
    Yan, Kun
    Yang, Luyi
    Yang, Zhanpeng
    Ren, Wenjuan
    SYMMETRY-BASEL, 2024, 16 (10)
  • [8] Improving Calibration through the Relationship with Adversarial Robustness
    Qin, Yao
    Wang, Xuezhi
    Beutel, Alex
    Chi, Ed H.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [9] Improving Robustness of Jet Tagging Algorithms with Adversarial Training
    Stein, A.
    Coubez, X.
    Mondal, S.
    Novak, A.
    Schmidt, A.
    COMPUTING AND SOFTWARE FOR BIG SCIENCE, 2022, 6 (1)
  • [10] Improving Single-Step Adversarial Training By Local Smoothing
    Wang, Shaopeng
    Huang, Yanhong
    Shi, Jianqi
    Yang, Yang
    Guo, Xin
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,