The Patch Overfitting Problem in Automated Program Repair: Practical Magnitude and a Baseline for Realistic Benchmarking

Cited by: 0

Authors:

Petke, Justyna [1]

Martinez, Matias [2]

Kechagia, Maria [1]

Aleti, Aldeida [3]

Sarro, Federica [1]

Affiliations:

[1] UCL, London, England

[2] Univ Politecn Cataluna, BarcelonaTech, Barcelona, Spain

[3] Monash Univ, Melbourne, Vic, Australia

Source:

COMPANION PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, FSE COMPANION 2024 | 2024

Funding:

Australian Research Council; UK Engineering and Physical Sciences Research Council

Keywords:

Overfitting; Automated Program Repair; Patch Assessment

DOI:

10.1145/3663529.3663776

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

Automated program repair techniques aim to generate patches for software bugs, mainly relying on testing to check their validity. The generation of a large number of such plausible yet incorrect patches is widely believed to hinder wider application of APR in practice, which has motivated research in automated patch assessment. We reflect on the validity of this motivation and carry out an empirical study to analyse the extent to which 10 APR tools suffer from the overfitting problem in practice. We observe that the number of plausible patches generated by any of the APR tools analysed for a given bug from the Defects4J dataset is remarkably low, a median of 2, indicating that a developer only needs to consider 2 patches in most cases to be confident to find a fix or confirming its nonexistence. This study unveils that the overfitting problem might not be as bad as previously thought. We reflect on current evaluation strategies of automated patch assessment techniques and propose a Random Selection baseline to assess whether and when using such techniques is beneficial for reducing human effort. We advocate future work should evaluate the benefit arising from patch overfitting assessment usage against the random baseline.
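This record does not give the paper's exact definition of the Random Selection baseline, so the following is only a minimal sketch under stated assumptions: a developer inspects the plausible patches for one bug in uniformly random order without repetition, stops at the first correct patch, and must inspect every patch when none is correct. The function names `expected_inspections` and `brute_force_expectation` are hypothetical and chosen for illustration only.

```python
from fractions import Fraction
from itertools import permutations


def expected_inspections(n_plausible: int, n_correct: int) -> Fraction:
    """Expected number of patches inspected under random ordering.

    Assumption (not taken from the paper): inspection stops at the first
    correct patch; with no correct patch, all plausible patches are inspected.
    """
    if n_plausible < 0 or not 0 <= n_correct <= n_plausible:
        raise ValueError("need 0 <= n_correct <= n_plausible")
    if n_correct == 0:
        # Every patch must be checked to conclude that no fix exists.
        return Fraction(n_plausible)
    # Expected position of the first of k marked items in a random
    # permutation of n items is (n + 1) / (k + 1).
    return Fraction(n_plausible + 1, n_correct + 1)


def brute_force_expectation(n_plausible: int, n_correct: int) -> Fraction:
    """Check the closed form by enumerating every inspection order (small n only)."""
    labels = [True] * n_correct + [False] * (n_plausible - n_correct)
    total = 0
    count = 0
    for order in permutations(labels):
        count += 1
        # 1-based position of the first correct patch, or the full length if none.
        total += order.index(True) + 1 if n_correct else n_plausible
    return Fraction(total, count)


if __name__ == "__main__":
    # Median case reported in the abstract: 2 plausible patches per bug.
    print(expected_inspections(2, 1))
    print(expected_inspections(2, 0))
```

Under these assumptions, a patch assessment technique would only reduce human effort for a bug if its ranking brings a correct patch to the front sooner, on average, than this random-order expectation.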


Pages: 452-456

Number of pages: 5
