A Practical Black-Box Attack on Source Code Authorship Identification Classifiers

被引:11
|
作者
Liu, Qianjun [1 ]
Ji, Shouling [1 ]
Liu, Changchang [2 ]
Wu, Chunming [1 ]
机构
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[2] IBM Thomas J Watson Res Ctr, Dept Distributed AI, Yorktown Hts, NY 10598 USA
基金
中国国家自然科学基金;
关键词
Feature extraction; Tools; Training; Syntactics; Predictive models; Perturbation methods; Transforms; Source code; authorship identification; adversarial stylometry; ROBUSTNESS;
D O I
10.1109/TIFS.2021.3080507
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Existing researches have recently shown that adversarial stylometry of source code can confuse source code authorship identification (SCAI) models, which may threaten the security of related applications such as programmer attribution, software forensics, etc. In this work, we propose source code authorship disguise (SCAD) to automatically hide programmers' identities from authorship identification, which is more practical than the previous work that requires to known the output probabilities or internal details of the target SCAI model. Specifically, SCAD trains a substitute model and develops a set of semantically equivalent transformations, based on which the original code is modified towards a disguised style with small manipulations in lexical features and syntactic features. When evaluated under totally black-box settings, on a real-world dataset consisting of 1,600 programmers, SCAD induces state-of-the-art SCAI models to cause above 30% misclassification rates. The efficiency and utility-preserving properties of SCAD are also demonstrated with multiple metrics. Furthermore, our work can serve as a guideline for developing more robust identification methods in the future.
引用
收藏
页码:3620 / 3633
页数:14
相关论文
共 50 条
  • [31] Black-Box Boundary Attack Based on Gradient Optimization
    Yang, Yuli
    Liu, Zishuo
    Lei, Zhen
    Wu, Shuhong
    Chen, Yongle
    ELECTRONICS, 2024, 13 (06)
  • [32] Boosting Black-box Adversarial Attack with a Better Convergence
    Yin, Heng
    Wang, Jindong
    Mi, Yan
    Zhang, Xiaoning
    2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 1234 - 1238
  • [33] Sparse Black-Box Video Attack with Reinforcement Learning
    Xingxing Wei
    Huanqian Yan
    Bo Li
    International Journal of Computer Vision, 2022, 130 : 1459 - 1473
  • [34] Sparse Black-Box Video Attack with Reinforcement Learning
    Wei, Xingxing
    Yan, Huanqian
    Li, Bo
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (06) : 1459 - 1473
  • [35] Projection & Probability-Driven Black-Box Attack
    Li, Jie
    Li, Rongrong
    Liu, Hong
    Liu, Jianzhuang
    Zhong, Bineng
    Deng, Cheng
    Tian, Qi
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 359 - 368
  • [36] An Effective Way to Boost Black-Box Adversarial Attack
    Feng, Xinjie
    Yao, Hongxun
    Che, Wenbin
    Zhang, Shengping
    MULTIMEDIA MODELING (MMM 2020), PT I, 2020, 11961 : 393 - 404
  • [37] Generalizable Black-Box Adversarial Attack With Meta Learning
    Yin, Fei
    Zhang, Yong
    Wu, Baoyuan
    Feng, Yan
    Zhang, Jingyi
    Fan, Yanbo
    Yang, Yujiu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (03) : 1804 - 1818
  • [38] Black-box Bayesian adversarial attack with transferable priors
    Zhang, Shudong
    Gao, Haichang
    Shu, Chao
    Cao, Xiwen
    Zhou, Yunyi
    He, Jianping
    MACHINE LEARNING, 2024, 113 (04) : 1511 - 1528
  • [39] A black-box adversarial attack on demand side management
    Cramer, Eike
    Gao, Ji
    COMPUTERS & CHEMICAL ENGINEERING, 2024, 186
  • [40] Adaptive hyperparameter optimization for black-box adversarial attack
    Zhenyu Guan
    Lixin Zhang
    Bohan Huang
    Bihe Zhao
    Song Bian
    International Journal of Information Security, 2023, 22 : 1765 - 1779