Who Wrote this Code? Watermarking for Code Generation

Cited by: 0
Authors
Lee, Taehyun [1]
Hong, Seokhee [1,3]
Ahn, Jaewoo [1]
Hong, Ilgee [1,4]
Lee, Hwaran [2]
Yun, Sangdoo [1,2]
Shin, Jamin [2]
Kim, Gunhee [1]
Affiliations
[1] Seoul Natl Univ, Seoul, South Korea
[2] NAVER AI Lab, Grenoble, France
[3] LG AI Res, Seoul, South Korea
[4] Georgia Inst Technol, Atlanta, GA USA
Funding
National Research Foundation, Singapore
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Since the remarkable generation performance of large language models has raised ethical and legal concerns, approaches that detect machine-generated text by embedding watermarks are being developed. However, we find that existing methods fail to function properly in code generation tasks because of the low entropy inherent in these tasks. Extending a logit-modifying watermark method, we propose Selective WatErmarking via Entropy Thresholding (SWEET), which enhances detection ability and mitigates code quality degradation by excluding low-entropy segments when generating and detecting watermarks. Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines, including post-hoc detection methods, in detecting machine-generated code. Our code is available at https://github.com/hongcheki/sweet-watermark.
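The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes the underlying logit-modifying method is a green-list scheme that adds a bias to a pseudo-random subset of the vocabulary chosen from the previous token. All function names and the values of `threshold`, `delta`, and `gamma` are hypothetical and chosen for illustration only.

```python
import hashlib
import math
import random


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def green_list(prev_token, vocab_size, gamma=0.5):
    """Assumed scheme: seed a PRNG with the previous token and mark a
    gamma-fraction of the vocabulary as 'green'."""
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(vocab_size), int(gamma * vocab_size)))


def watermark_logits(logits, prev_token, threshold=1.0, delta=2.0, gamma=0.5):
    """Bias green-token logits by delta, but only at high-entropy positions.
    Low-entropy positions are returned unchanged to protect code quality."""
    if entropy(softmax(logits)) <= threshold:
        return list(logits)
    green = green_list(prev_token, len(logits), gamma)
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def detect_z_score(tokens, entropies, vocab_size, threshold=1.0, gamma=0.5):
    """z-score of the green-token count, computed only over positions whose
    entropy exceeded the threshold (low-entropy positions are skipped)."""
    scored = 0
    green_hits = 0
    for t in range(1, len(tokens)):
        if entropies[t] <= threshold:
            continue
        scored += 1
        if tokens[t] in green_list(tokens[t - 1], vocab_size, gamma):
            green_hits += 1
    if scored == 0:
        return 0.0
    return (green_hits - gamma * scored) / math.sqrt(gamma * (1 - gamma) * scored)
```

In this sketch a detector would flag a sequence as machine-generated when the z-score exceeds a chosen cutoff. The per-position entropies at detection time would have to be recomputed with a language model, which is left outside the example.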
Pages: 4890-4911
Page count: 22