CoCoFuzzing: Testing Neural Code Models With Coverage-Guided Fuzzing

Cited by: 4

Authors:

Wei, Moshi [1]

Huang, Yuchao [2]

Yang, Jinqiu [3]

Wang, Junjie [2]

Wang, Song [1]

Affiliations:

[1] York Univ, Toronto, ON M3J 1P3, Canada

[2] Chinese Acad Sci, Inst Software, Beijing 100045, Peoples R China

[3] Concordia Univ, Montreal, PQ H3G 1M8, Canada

Source:

IEEE TRANSACTIONS ON RELIABILITY | 2023 / Vol. 72 / No. 03

Keywords:

Codes; Testing; Fuzzing; Neurons; Software; Biological neural networks; Task analysis; Code model; deep learning (DL); fuzzy logic; language model; robustness;

DOI:

10.1109/TR.2022.3208239

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812;

Abstract:

Deep learning (DL)-based code processing models have demonstrated good performance on tasks such as method name prediction, program summarization, and comment generation. Despite these advances, DL models are frequently susceptible to adversarial attacks, which cause them to mispredict on unexpected inputs and thereby threaten their robustness and generalizability. Numerous DL testing approaches have been proposed to address this issue; however, they primarily target DL applications in image, audio, and text analysis and cannot be directly applied to neural models for code because of the unique properties of programs. In this article, we propose a coverage-based fuzzing framework, CoCoFuzzing, for testing DL-based code processing models. In particular, we first propose 10 mutation operators that automatically generate valid, semantics-preserving source code examples as tests, followed by a neuron coverage (NC)-based approach for guiding the generation of tests. CoCoFuzzing is evaluated on three state-of-the-art neural code models: NeuralCodeSum, CODE2SEQ, and CODE2VEC. Our experimental results indicate that CoCoFuzzing can generate valid, semantics-preserving source code examples that test the robustness and generalizability of these models and improve NC. Furthermore, these tests can be used for adversarial retraining to improve the performance of neural code models.
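To illustrate the idea of NC-guided test generation described in the abstract, the following is a minimal Python sketch, not the authors' implementation. Everything named here is an assumption made for illustration: the two mutators (`insert_unused_declaration`, `insert_dead_branch`) are hypothetical stand-ins, since the abstract does not list the 10 operators; `toy_activations` is a fake four-neuron "model" driven by substring checks; and the 0.5 activation threshold is arbitrary. A mutated program is kept as a test only if it activates at least one neuron not yet covered.

```python
from typing import Callable, List, Set, Tuple


def activated_neurons(activations: List[float], threshold: float = 0.5) -> Set[int]:
    """Return indices of neurons whose activation exceeds the threshold."""
    return {i for i, a in enumerate(activations) if a > threshold}


# Hypothetical semantics-preserving mutators for a Java-like method (assumptions).
def insert_unused_declaration(code: str) -> str:
    # Use a fresh name each time so repeated mutation stays syntactically valid.
    name = f"unused_{code.count('unused_')}"
    return code.replace("{", "{ int " + name + " = 0;", 1)


def insert_dead_branch(code: str) -> str:
    # A branch that never executes does not change program behaviour.
    return code.replace("{", "{ if (false) { }", 1)


def toy_activations(code: str) -> List[float]:
    """Fake 4-neuron model (assumption): each neuron reacts to a substring."""
    patterns = ["return", "int unused_", "if (false)", "while"]
    return [1.0 if p in code else 0.0 for p in patterns]


def coverage_guided_fuzz(
    seed: str,
    mutators: List[Callable[[str], str]],
    get_activations: Callable[[str], List[float]],
    total_neurons: int,
    max_iters: int = 10,
) -> Tuple[List[str], float]:
    """Keep only mutants that activate previously uncovered neurons."""
    covered = activated_neurons(get_activations(seed))
    tests: List[str] = []
    current = seed
    for _ in range(max_iters):
        improved = False
        for mutate in mutators:
            candidate = mutate(current)
            new = activated_neurons(get_activations(candidate)) - covered
            if new:  # coverage gain: accept the mutant and continue from it
                covered |= new
                tests.append(candidate)
                current = candidate
                improved = True
                break
        if not improved:  # no mutator increases coverage any further
            break
    return tests, len(covered) / total_neurons


seed = "int add(int a, int b) { return a + b; }"
tests, coverage = coverage_guided_fuzz(
    seed, [insert_unused_declaration, insert_dead_branch], toy_activations, 4
)
for t in tests:
    print(t)
print("neuron coverage:", coverage)
```

In a real setting, `get_activations` would run the neural code model on the program and collect hidden-layer outputs; the greedy accept-on-gain loop here is only one simple way to let coverage steer mutation.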


Pages: 1276 - 1289

Page count: 14
