Orthogonal Deep Models as Defense Against Black-Box Attacks

Cited by: 5
Authors
Jalwana, Mohammad A. A. K. [1 ]
Akhtar, Naveed [1 ]
Bennamoun, Mohammed [1 ]
Mian, Ajmal [1 ]
Affiliations
[1] Univ Western Australia, Dept Comp Sci & Software Engn, Perth, WA 6009, Australia
Funding
Australian Research Council
Keywords
Deep learning; adversarial examples; adversarial perturbations; orthogonal models; robust deep learning
DOI
10.1109/ACCESS.2020.3005961
CLC number
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
Deep learning has demonstrated state-of-the-art performance on a variety of challenging computer vision tasks. On one hand, this has enabled deep visual models to pave the way for a plethora of critical applications such as disease prognostics and smart surveillance. On the other, deep learning has also been found vulnerable to adversarial attacks, which calls for new techniques to defend deep models against them. Among attack algorithms, black-box schemes are of serious practical concern because they require only publicly available knowledge of the targeted model. We carefully analyze the inherent weakness of deep models in black-box settings, where the attacker may develop the attack using a model similar to the targeted one. Based on our analysis, we introduce a novel gradient regularization scheme that encourages the internal representation of a deep model to be orthogonal to that of another model, even when the two architectures are similar. Our unique constraint allows a model to simultaneously strive for higher accuracy while maintaining near-orthogonal alignment of its gradients with respect to a reference model. A detailed empirical study verifies that this controlled misalignment of gradients under our orthogonality objective significantly boosts a model's robustness against transferable black-box adversarial attacks. Compared to regular models, the orthogonal models are significantly more robust to a range of ℓ_p-norm bounded perturbations. We verify the effectiveness of our technique on a variety of large-scale models.
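The abstract names the core mechanism: a regularizer that trains a model for accuracy while keeping its gradients near-orthogonal to those of a fixed reference model, so that attacks crafted on the reference transfer poorly. Below is a minimal PyTorch sketch of that idea, not the authors' released code: the names orthogonality_loss, model, reference, and lambda_orth are illustrative assumptions, and penalizing the squared cosine similarity between input gradients is one plausible instantiation of the paper's orthogonality objective.

import torch
import torch.nn.functional as F

def orthogonality_loss(model, reference, x, y, lambda_orth=1.0):
    # Hypothetical sketch: cross-entropy on the trained model plus a
    # penalty pushing its input gradients toward orthogonality with a
    # frozen reference model.
    x = x.clone().detach().requires_grad_(True)

    # Input gradient of the reference model's loss; detached because the
    # reference is fixed and only serves as the alignment target.
    ref_loss = F.cross_entropy(reference(x), y)
    ref_grad = torch.autograd.grad(ref_loss, x)[0].detach()

    # Input gradient of the trained model's loss; create_graph=True keeps
    # it differentiable so the penalty can update the model's parameters.
    ce = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(ce, x, create_graph=True)[0]

    # Squared cosine similarity near 0 means near-orthogonal gradients,
    # while leaving the gradient magnitudes unconstrained.
    cos = F.cosine_similarity(grad.flatten(1), ref_grad.flatten(1), dim=1)
    return ce + lambda_orth * cos.pow(2).mean()

In a training loop one would compute loss = orthogonality_loss(model, reference, images, labels) and call loss.backward() per batch; the weight lambda_orth trades clean accuracy against gradient misalignment, mirroring the accuracy-versus-orthogonality balance the abstract describes.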
Pages: 119744-119757
Page count: 14