CoCoFuzzing: Testing Neural Code Models With Coverage-Guided Fuzzing

Cited by: 4
Authors
Wei, Moshi [1]
Huang, Yuchao [2]
Yang, Jinqiu [3]
Wang, Junjie [2]
Wang, Song [1]
Affiliations
[1] York Univ, Toronto, ON M3J 1P3, Canada
[2] Chinese Acad Sci, Inst Software, Beijing 100045, Peoples R China
[3] Concordia Univ, Montreal, PQ H3G 1M8, Canada
Keywords
Codes; Testing; Fuzzing; Neurons; Software; Biological neural networks; Task analysis; Code model; deep learning (DL); fuzzy logic; language model; robustness
DOI
10.1109/TR.2022.3208239
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep learning (DL)-based code processing models have demonstrated good performance on tasks such as method name prediction, program summarization, and comment generation. Despite these advances, DL models are frequently susceptible to adversarial attacks, which threaten their robustness and generalizability by causing them to mispredict on unexpected inputs. Numerous DL testing approaches have been proposed to address this issue; however, they primarily target DL applications in domains such as image, audio, and text analysis, and cannot be directly applied to neural models for code because of the unique properties of programs. In this article, we propose a coverage-based fuzzing framework, CoCoFuzzing, for testing DL-based code processing models. In particular, we first propose 10 mutation operators that automatically generate valid, semantics-preserving source code examples as tests, and then a neuron coverage (NC)-based approach for guiding the generation of tests. The performance of CoCoFuzzing is evaluated on three state-of-the-art neural code models: NeuralCodeSum, CODE2SEQ, and CODE2VEC. Our experimental results indicate that CoCoFuzzing can generate valid, semantics-preserving source code examples for testing the robustness and generalizability of these models and for increasing NC. Furthermore, these tests can be used for adversarial retraining to improve the performance of neural code models.
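The general idea in the abstract, mutating a program without changing its behavior and keeping only mutants that activate new neurons, can be illustrated with a minimal sketch. This is not the CoCoFuzzing implementation: the function names, the single dead-store mutation, the 0.5 activation threshold, and the hash-based `toy_model_activations` stand-in for a real neural code model are all assumptions made for illustration.

```python
import hashlib
import random


def neuron_coverage(activations, threshold=0.5):
    """Return the indices of neurons whose activation exceeds the threshold."""
    return {i for i, a in enumerate(activations) if a > threshold}


def insert_dead_store(code, rng):
    """Hypothetical semantics-preserving mutation: declare an unused local
    variable right after the first opening brace of the snippet."""
    head, brace, tail = code.partition("{")
    if not brace:
        return code  # nothing to mutate
    name = f"unused_{rng.randrange(10**6)}"
    return f"{head}{{ int {name} = 0;{tail}"


def toy_model_activations(code, n_neurons=16):
    """Stand-in for a real model (assumption): derive deterministic
    pseudo-activations in [0, 1] from a hash of the input text."""
    digest = hashlib.sha256(code.encode("utf-8")).digest()
    return [b / 255 for b in digest[:n_neurons]]


def fuzz(seed, activations_fn, mutate, rounds=20, threshold=0.5, rng=None):
    """Greedy coverage-guided loop: keep a mutant only if it activates
    at least one neuron that no earlier input has activated."""
    rng = rng or random.Random(0)
    covered = neuron_coverage(activations_fn(seed), threshold)
    current, kept = seed, []
    for _ in range(rounds):
        candidate = mutate(current, rng)
        newly = neuron_coverage(activations_fn(candidate), threshold) - covered
        if newly:
            covered |= newly
            kept.append(candidate)
            current = candidate  # continue mutating from the useful mutant
    return kept, covered


if __name__ == "__main__":
    seed = "int add(int a, int b) { return a + b; }"
    tests, covered = fuzz(seed, toy_model_activations, insert_dead_store)
    print(f"kept {len(tests)} mutants, coverage {len(covered)}/16")
```

With a real model, `toy_model_activations` would be replaced by a function that runs the model on the snippet and returns the recorded layer activations; the kept mutants are the candidate test inputs.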
Pages: 1276-1289
Number of pages: 14