Imperceptible Local Adversarial Attacks on Human Pose Estimation

Cited by: 0
Authors
Liu F. [1 ]
Wang H. [1 ]
Wang Y. [2 ]
Miao Y. [1 ]
Affiliations
[1] School of Information Science and Technology, Hangzhou Normal University, Hangzhou
[2] College of Mechanical and Electrical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing
Keywords
adversarial attack; human pose estimation; imperceptibility; local perturbation; white-box attack
DOI
10.3724/SP.J.1089.2023.19765
Abstract
Though deep neural networks have achieved state-of-the-art performance in many tasks, they have recently been shown to be vulnerable to slight adversarial perturbations of input samples. In adversarial attacks on human pose estimation, large perturbations are usually required for a successful attack, which degrades imperceptibility; small perturbations, on the other hand, preserve imperceptibility but weaken the attack's effect. To address this issue, this paper proposes a two-stage local adversarial attack method for human pose estimation. The proposed method first estimates critical perturbation regions through a pre-attack, and then generates adversarial perturbations within each critical region under an imperceptibility constraint. This improves the attack success rate on human pose estimation while retaining imperceptibility. We validate the effectiveness of our method on the COCO2017 dataset in terms of the PCK metric and compare the results with existing methods, including IGSM and C&W. Our method outperforms these baselines, improving the attack success rate by 15.4% and 2.8%, respectively. The experiments show that our method achieves higher attack success rates while keeping the attack imperceptible. © 2023 Institute of Computing Technology. All rights reserved.
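To make the two-stage idea in the abstract concrete, the sketch below shows one plausible PyTorch realization under stated assumptions, not the paper's released implementation: the pose model is assumed to output keypoint heatmaps, the pre-attack ranks image patches by input-gradient saliency to select critical regions, and the second stage runs IGSM-style iterations restricted to those regions inside an L∞ ball. The function names, MSE heatmap loss, patch size, and all hyper-parameters are illustrative assumptions.

```python
# A minimal sketch of a two-stage local adversarial attack in the spirit of the
# abstract; the model interface, loss, and hyper-parameters are assumptions.
import torch
import torch.nn.functional as F

def find_critical_mask(model, x, y_heatmaps, patch=16, top_k=8):
    """Stage 1 (pre-attack): rank image patches by input-gradient saliency and
    keep only the top_k patches as the region allowed to be perturbed.
    Assumes H and W are divisible by `patch`."""
    x = x.detach().clone().requires_grad_(True)
    loss = F.mse_loss(model(x), y_heatmaps)
    grad = torch.autograd.grad(loss, x)[0].abs().sum(dim=1, keepdim=True)   # (B,1,H,W)
    scores = F.avg_pool2d(grad, patch)                                      # (B,1,H/p,W/p)
    b, _, gh, gw = scores.shape
    flat = scores.view(b, -1)
    idx = flat.topk(top_k, dim=1).indices
    mask = torch.zeros_like(flat).scatter_(1, idx, 1.0).view(b, 1, gh, gw)
    return F.interpolate(mask, scale_factor=patch, mode="nearest")          # (B,1,H,W)

def local_attack(model, x, y_heatmaps, eps=4/255, alpha=1/255, steps=40,
                 patch=16, top_k=8):
    """Stage 2: IGSM-style iterations restricted to the critical mask and to an
    L-infinity ball of radius eps (the imperceptibility constraint).
    Assumes images are unnormalized, in [0, 1]."""
    x = x.detach()
    mask = find_critical_mask(model, x, y_heatmaps, patch, top_k)
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.mse_loss(model(x + delta), y_heatmaps)
        grad = torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            delta = delta + alpha * grad.sign()          # ascend the pose loss
            delta = delta.clamp(-eps, eps) * mask        # local + imperceptible
            delta = (x + delta).clamp(0, 1) - x          # keep a valid image
    return (x + delta).detach()
```

On COCO-style data, attack success would then be judged by the drop in PCK between the model's predictions on the clean image and on the returned adversarial image.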
Pages: 1577-1587
Page count: 10
References
27 entries in total
  • [1] Deng J, Guo J, Yang J, Et al., Variational prototype learning for deep face recognition, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11906-11915, (2021)
  • [2] Graves A, Jaitly N., Towards end-to-end speech recognition with recurrent neural networks, Proceedings of the International Conference on Machine Learning, pp. 1764-1772, (2014)
  • [3] Pan X, Shi J, Luo P, Et al., Spatial as deep: Spatial CNN for traffic scene understanding, Proceedings of the AAAI Conference on Artificial Intelligence, pp. 7276-7283, (2018)
  • [4] Ronneberger O, Fischer P, Brox T., U-net: convolutional networks for biomedical image segmentation, Proceedings of the International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 234-241, (2015)
  • [5] Goodfellow I J, Shlens J, Szegedy C., Explaining and harnessing adversarial examples, (2015)
  • [6] Su J, Vargas D V, Sakurai K., One pixel attack for fooling deep neural networks, IEEE Transactions on Evolutionary Computation, 23, 5, pp. 828-841, (2019)
  • [7] Jain N, Shah S, Kumar A, Et al., On the robustness of human pose estimation, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 29-38, (2019)
  • [8] Newell A, Yang K, Deng J., Stacked hourglass networks for human pose estimation, Proceedings of the European Conference on Computer Vision, pp. 483-499, (2016)
  • [9] Gkioxari G, Toshev A, Jaitly N., Chained predictions using convolutional neural networks, Proceedings of the European Conference on Computer Vision, pp. 728-743, (2016)
  • [10] Yang W, Li S, Ouyang W, Et al., Learning feature pyramids for human pose estimation, Proceedings of the IEEE International Conference on Computer Vision, pp. 1281-1290, (2017)