Who Wrote this Code? Watermarking for Code Generation

Authors
Lee, Taehyun [1]
Hong, Seokhee [1, 3]
Ahn, Jaewoo [1]
Hong, Ilgee [1, 4]
Lee, Hwaran [2]
Yun, Sangdoo [1, 2]
Shin, Jamin [2]
Kim, Gunhee [1]
Affiliations
[1] Seoul Natl Univ, Seoul, South Korea
[2] NAVER AI Lab, Grenoble, France
[3] LG AI Res, Seoul, South Korea
[4] Georgia Inst Technol, Atlanta, GA USA
Funding
National Research Foundation of Singapore
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The remarkable generation performance of large language models has raised ethical and legal concerns, prompting the development of approaches that detect machine-generated text by embedding watermarks. However, we find that existing methods fail to function properly in code generation tasks because code generation is inherently low-entropy. Extending a logit-modifying watermark method, we propose Selective WatErmarking via Entropy Thresholding (SWEET), which enhances detection ability and mitigates code quality degeneration by removing low-entropy segments when generating and detecting watermarks. Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines, including post-hoc detection methods, in detecting machine-generated code text. Our code is available at https://github.com/hongcheki/sweet-watermark.
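The idea in the abstract, applying a logit-modifying watermark only where the model's next-token entropy is above a threshold and scoring only those positions at detection time, can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). It assumes a green-list style watermark seeded by the previous token, and the names `green_list`, `watermark_logits`, `detect_z` and the parameters `tau` (entropy threshold), `delta` (logit boost) and `gamma` (green-list fraction) are hypothetical choices made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def green_list(prev_token, vocab_size, gamma=0.5):
    """Assumed scheme: pseudo-random vocabulary subset seeded by the previous token."""
    rng = random.Random(prev_token)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def watermark_logits(logits, prev_token, tau=0.5, delta=2.0, gamma=0.5):
    """Boost green-list logits by delta, but only when entropy exceeds tau."""
    if entropy(softmax(logits)) <= tau:
        return list(logits)  # low-entropy step: leave the distribution untouched
    green = green_list(prev_token, len(logits), gamma)
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def detect_z(tokens, entropies, vocab_size, tau=0.5, gamma=0.5):
    """z-score of green-token count over high-entropy positions only.

    entropies[t] is the entropy of the distribution tokens[t] was drawn from,
    assumed here to be recomputed by the detector with some language model.
    """
    scored = green = 0
    for t in range(1, len(tokens)):
        if entropies[t] <= tau:
            continue  # skip low-entropy positions, mirroring generation
        scored += 1
        if tokens[t] in green_list(tokens[t - 1], vocab_size, gamma):
            green += 1
    if scored == 0:
        return 0.0
    return (green - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))
```

A near-deterministic step (for example, logits `[10.0, 0.0, 0.0, 0.0]`) passes through unchanged, which is how the sketch avoids perturbing tokens that code syntax effectively forces. A higher `detect_z` value means more green tokens than chance would predict among the positions that were scored.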
Pages: 4890-4911
Number of pages: 22