Enhanced automated code vulnerability repair using large language models

Cited by: 2

Authors:

de-Fitero-Dominguez, David ^{[1]}

Garcia-Lopez, Eva ^{[1]}

Garcia-Cabot, Antonio ^{[1]}

Martinez-Herraiz, Jose-Javier ^{[1]}

Affiliations:

[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2024 / Vol. 138

Keywords:

Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code Llama

DOI:

10.1016/j.engappai.2024.109291

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

This research addresses the complex challenge of automated repair of code vulnerabilities, vital for enhancing digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for the representation of code modification, using advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets featuring C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is the enhanced repair accuracy of these models when compared to previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. Following this, it underscores the importance of using test datasets devoid of train samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work is its contribution to digital security, setting new standards for automated code vulnerability repair and paving the way for future advancements in the fields of cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs in enhancing code security but also fosters further exploration and research in these crucial areas.


Pages: 13

50 items in total

[21] FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair
Fatima, Sakina
Hemmati, Hadi
C. Briand, Lionel
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2024, 50 (12) : 3146 - 3171
[22] An Empirical Evaluation of Large Language Models in Static Code Analysis for PHP Vulnerability Detection
Cetin, Orcun
Ekmekcioglu, Emre
Arief, Budi
Hernandez-Castro, Julio
JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2024, 30 (09) : 1163 - 1183
[23] Exploring the Potential of Pre-Trained Language Models of Code for Automated Program Repair
Hao, Sichong
Shi, Xianjun
Liu, Hongwei
ELECTRONICS, 2024, 13 (07)
[24] Investigating large language models capabilities for automatic code repair in Python
Omari, Safwan
Basnet, Kshitiz
Wardat, Mohammad
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (08) : 10717 - 10731
[25] KARGEN: Knowledge-Enhanced Automated Radiology Report Generation Using Large Language Models
Li, Yingshu
Wang, Zhanyu
Liu, Yunyi
Wang, Lei
Liu, Lingqiao
Zhou, Luping
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT V, 2024, 15005 : 382 - 392
[26] A Novel Approach for Automated Program Repair using Round-Trip Translation with Large Language Models
Ruiz, Fernando Vallecillos
Grishina, Anastasiia
Hort, Max
Moonen, Leon
arXiv,
[27] Code Detection for Hardware Acceleration Using Large Language Models
Martinez, Pablo Antonio
Bernabe, Gregorio
Garcia, Jose Manuel
IEEE ACCESS, 2024, 12 : 35271 - 35281
[28] Repairing Infrastructure-as-Code using Large Language Models
Low, En
Cheh, Carmen
Chen, Binbin
2024 IEEE SECURE DEVELOPMENT CONFERENCE, SECDEV 2024, 2024, : 20 - 27
[29] Automated Program Repair in the Era of Large Pre-trained Language Models
Xia, Chunqiu Steven
Wei, Yuxiang
Zhang, Lingming
2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE, 2023, : 1482 - 1494
[30] Automated Grading in Coding Exercises Using Large Language Models
Lagakis, Paraskevas
Demetriadis, Stavros
Psathas, Georgios
SMART MOBILE COMMUNICATION & ARTIFICIAL INTELLIGENCE, VOL 1, IMCL 2023, 2024, 936 : 363 - 373
