Large Language Models for Code Obfuscation: Evaluation of the Obfuscation Capabilities of OpenAI's GPT-3.5 on C Source Code

Cited by: 0
Authors
Kochberger, Patrick [1, 2]
Gramberger, Maximilian [1]
Schrittwieser, Sebastian [2]
Lawitschka, Caroline [2]
Weippl, Edgar R. [3]
Affiliations
[1] St Polten Univ Appl Sci, Inst IT Secur Res, St Polten, Austria
[2] Univ Vienna, Res Grp Secur & Privacy, Vienna, Austria
[3] SBA Res, Vienna, Austria
Funding
Austrian Science Fund;
Keywords
Software Protections; Code Obfuscation; Large Language Model; GPT;
DOI
10.5220/0012167000003555
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This study explores the efficacy of large language models, specifically GPT-3.5, in obfuscating C source code for software protection. We utilized eight distinct obfuscation techniques in tandem with seven representative C code samples to conduct a comprehensive analysis. The evaluation was performed using a Python-based tool we developed, which interfaces with the OpenAI API to access GPT-3.5. Our metrics of evaluation included the correctness and diversity of the obfuscated code, along with the robustness of the resultant protection. While the diversity of the resulting code was found to be commendable, our findings indicate a prevalent issue with the correctness of the obfuscated code and the overall level of protection provided. Consequently, we assert that while promising, the feasibility of deploying large language models for automatic code obfuscation is not yet sufficiently established. This study signifies an important step towards understanding the limitations and potential of AI-based code obfuscation, thereby informing future research in this area.
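Illustrative sketch (not from the paper): the abstract mentions a Python-based tool that sends C code to GPT-3.5 and scores the results for correctness and diversity, but this record does not give the prompts or the exact metrics. The snippet below is one possible way such scoring could look, under stated assumptions: `build_prompt`, `mean_pairwise_diversity`, and `outputs_match` are hypothetical names, the prompt wording is invented, diversity is approximated with `difflib` text similarity, and correctness is approximated by comparing recorded program outputs. No API call is made.

```python
from difflib import SequenceMatcher
from itertools import combinations


def build_prompt(technique: str, c_source: str) -> str:
    """Hypothetical prompt template; the paper's real prompts are not shown in this record."""
    return (
        f"Apply the obfuscation technique '{technique}' to the following C code. "
        "Preserve its behavior and return only C code.\n\n" + c_source
    )


def mean_pairwise_diversity(variants: list[str]) -> float:
    """Assumed diversity proxy: average of (1 - similarity ratio) over all variant pairs."""
    pairs = list(combinations(variants, 2))
    if not pairs:
        return 0.0  # fewer than two variants: nothing to compare
    total = sum(1.0 - SequenceMatcher(None, a, b).ratio() for a, b in pairs)
    return total / len(pairs)


def outputs_match(expected: dict[str, str], observed: dict[str, str]) -> bool:
    """Assumed correctness proxy: identical output for every recorded test input."""
    return expected == observed
```

In such a setup, the outputs passed to `outputs_match` would come from compiling and running the original and the obfuscated program on the same inputs; that step is omitted here because it depends on the local toolchain.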
Pages: 7-19
Page count: 13