The Patch Overfitting Problem in Automated Program Repair: Practical Magnitude and a Baseline for Realistic Benchmarking

Cited by: 0
Authors
Petke, Justyna [1]
Martinez, Matias [2]
Kechagia, Maria [1]
Aleti, Aldeida [3]
Sarro, Federica [1]
Affiliations
[1] UCL, London, England
[2] Univ Politecn Cataluna, BarcelonaTech, Barcelona, Spain
[3] Monash Univ, Melbourne, Vic, Australia
Funding
Australian Research Council; UK Engineering and Physical Sciences Research Council;
Keywords
Overfitting; Automated Program Repair; Patch Assessment;
DOI
10.1145/3663529.3663776
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Automated program repair (APR) techniques aim to generate patches for software bugs, mainly relying on testing to check their validity. The generation of a large number of plausible yet incorrect patches is widely believed to hinder wider application of APR in practice, which has motivated research in automated patch assessment. We reflect on the validity of this motivation and carry out an empirical study to analyse the extent to which 10 APR tools suffer from the overfitting problem in practice. We observe that the number of plausible patches generated by any of the APR tools analysed for a given bug from the Defects4J dataset is remarkably low, with a median of 2, indicating that in most cases a developer only needs to consider 2 patches to be confident of either finding a fix or confirming that none exists. This study reveals that the overfitting problem might not be as bad as previously thought. We reflect on current evaluation strategies of automated patch assessment techniques and propose a Random Selection baseline to assess whether and when using such techniques is beneficial for reducing human effort. We advocate that future work evaluate the benefit of using patch overfitting assessment against the random baseline.
Pages: 452-456
Page count: 5
Related papers
23 in total
  • [1] Overfitting in semantics-based automated program repair
    Le, Xuan Bach D.
    Thung, Ferdian
    Lo, David
    Le Goues, Claire
    EMPIRICAL SOFTWARE ENGINEERING, 2018, 23 (05) : 3007 - 3033
  • [2] Overfitting in semantics-based automated program repair
    Xuan Bach D. Le
    Ferdian Thung
    David Lo
    Claire Le Goues
    Empirical Software Engineering, 2018, 23 : 3007 - 3033
  • [3] Overfitting in Semantics-based Automated Program Repair
    Le, Xuan-Bach D.
    Thung, Ferdian
    Lo, David
    Le Goues, Claire
    PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2018, : 163 - 163
  • [4] Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair
    Smith, Edward K.
    Barr, Earl T.
    Le Goues, Claire
    Brun, Yuriy
    2015 10TH JOINT MEETING OF THE EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND THE ACM SIGSOFT SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE 2015) PROCEEDINGS, 2015, : 532 - 543
  • [5] Automated patch assessment for program repair at scale
    Ye, He
    Martinez, Matias
    Monperrus, Martin
    EMPIRICAL SOFTWARE ENGINEERING, 2021, 26 (02)
  • [6] Adversarial patch generation for automated program repair
    Alhefdhi, Abdulaziz
    Dam, Hoa Khanh
    Le-Cong, Thanh
    Le, Bach
    Ghose, Aditya
    SOFTWARE QUALITY JOURNAL, 2025, 33 (01)
  • [7] Automated patch assessment for program repair at scale
    He Ye
    Matias Martinez
    Martin Monperrus
    Empirical Software Engineering, 2021, 26
  • [8] Be Realistic: Automated Program Repair is a Combination of Undecidable Problems
    Nilizadeh, Amirfarhad
    Leavens, Gary T.
    INTERNATIONAL WORKSHOP ON AUTOMATED PROGRAM REPAIR (APR 2022), 2022, : 31 - 32
  • [9] Automated Program Repair and Test Overfitting: Measurements and Approaches using Formal Methods
    Nilizadeh, Amirfarhad
    2022 IEEE 15TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION (ICST 2022), 2022, : 480 - 482
  • [10] Exploring True Test Overfitting in Dynamic Automated Program Repair using Formal Methods
    Nilizadeh, Amirfarhad
    Leavens, Gary T.
    Le, Xuan-Bach D.
    Pasareanu, Corina S.
    Cok, David R.
    2021 14TH IEEE CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION (ICST 2021), 2021, : 229 - 240