Enhanced automated code vulnerability repair using large language models

Cited by: 2
Authors
de-Fitero-Dominguez, David [1]
Garcia-Lopez, Eva [1]
Garcia-Cabot, Antonio [1]
Martinez-Herraiz, Jose-Javier [1]
Affiliation
[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain
Keywords
Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code Llama
DOI
10.1016/j.engappai.2024.109291
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This research addresses the complex challenge of automated repair of code vulnerabilities, which is vital for enhancing digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for representing code modifications and applies it with advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets of C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is that these models achieve higher repair accuracy than previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and of their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. It then underscores the importance of using test datasets that contain no training samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work lies in its contribution to digital security: it sets new standards for automated code vulnerability repair and paves the way for future advancements in cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs in enhancing code security but also encourages further exploration and research in these crucial areas.
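The two evaluation points raised in the abstract, the "Perfect Predictions" metric and test sets that contain no training samples, can be illustrated with a minimal Python sketch. The abstract does not say how the paper computes the metric or detects overlap, so everything here is an assumption made for illustration: the helper names (normalize, drop_leaked_samples, perfect_prediction_rate), the representation of a sample as a (vulnerable code, fixed code) pair, and the choice to compare code after collapsing whitespace.

```python
def normalize(code: str) -> str:
    """Collapse every whitespace run to one space (assumed comparison rule)."""
    return " ".join(code.split())


def drop_leaked_samples(train, test):
    """Keep only test pairs whose vulnerable code does not occur in train.

    Both arguments are lists of (vulnerable_code, fixed_code) tuples,
    a hypothetical layout chosen for this sketch.
    """
    seen = {normalize(vulnerable) for vulnerable, _ in train}
    return [(v, f) for v, f in test if normalize(v) not in seen]


def perfect_prediction_rate(predictions, references):
    """Fraction of predictions equal to the reference fix after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return hits / len(references)


if __name__ == "__main__":
    # Toy data, invented for the example.
    train = [("strcpy(buf, s);", "strncpy(buf, s, sizeof(buf) - 1);")]
    test = [
        ("strcpy(buf,  s);", "strncpy(buf, s, sizeof(buf) - 1);"),  # also in train
        ("gets(line);", "fgets(line, sizeof(line), stdin);"),
    ]
    clean_test = drop_leaked_samples(train, test)
    references = [fix for _, fix in clean_test]
    predictions = ["fgets(line, sizeof(line), stdin);"]
    print(len(clean_test), perfect_prediction_rate(predictions, references))
```

An exact-match score of this kind counts a prediction as wrong whenever it differs textually from the reference, even if it is a functionally equivalent fix, which is one way such a metric can understate a model's real capability. Filtering the test set first keeps samples the model already saw during training from inflating the score.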
Pages: 13