CodeWMBench: An Automated Benchmark for Code Watermarking Evaluation

Cited by: 0

Authors:

Wu, Benlong [1]

Chen, Kejiang [1]

He, Yanru [1]

Chen, Guoqiang [1]

Zhang, Weiming [1]

Yu, Nenghai [1]

Affiliations:

[1] University of Science and Technology of China, Hefei, Anhui, People's Republic of China

Source:

PROCEEDINGS OF THE ACM TURING AWARD CELEBRATION CONFERENCE-CHINA 2024, ACM-TURC 2024 | 2024

Keywords:

Programming language model; code watermark; benchmark; software watermarking

DOI:

10.1145/3674399.3674447

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

As deep learning progresses, programming language generation models such as CodeLlama, GitHub Copilot, and ChatGPT have been widely applied to intelligent code development. However, this also reduces the cost of code plagiarism, posing challenges to copyright and academic integrity. To address the need to distinguish human-written code from machine-generated code, this paper introduces CodeWMBench, a comprehensive automated benchmark for active detection of machine-generated code through watermarking. Through a meticulous evaluation of eight code watermarking methods, we report their performance in terms of harmlessness, robustness, and transparency. Specifically, we are the first to introduce watermark removal techniques based on large language models and to assess these watermarking methods against code rewriting and retranslating attacks. In the discussion, we examine the critical issues currently facing code watermarking, including why existing code watermarking methods struggle to resist removal by large language models and which future methods might withstand such removal.


Pages: 120-125

Page count: 6
