Overfitting in Semantics-based Automated Program Repair

Cited by: 14
Authors
Le, Xuan-Bach D. [1]
Thung, Ferdian [1]
Lo, David [1]
Le Goues, Claire [2]
Affiliations
[1] Singapore Management Univ, Singapore, Singapore
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
Automated Program Repair; Program Synthesis; Symbolic Execution; Patch Overfitting
DOI
10.1145/3180155.3182536
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Existing automated program repair (APR) techniques can be broadly divided into two families: semantics-based and heuristics-based. Semantics-based APR uses symbolic execution and test suites to extract semantic constraints, then uses program synthesis to generate repairs that satisfy those constraints. Heuristics-based APR generates large populations of repair candidates via source manipulation and searches for the best among them. Both families largely rely on the assumption that a program is correctly patched if the generated patch leads the program to pass all provided test cases. Patch correctness is thus an especially pressing concern. A repair technique may generate overfitting patches, which lead a program to pass all existing test cases but fail to generalize beyond them. In this work, we revisit the overfitting problem with a focus on semantics-based APR techniques, complementing previous studies of the overfitting problem in heuristics-based APR. We perform our study using the IntroClass and Codeflaws benchmarks, two datasets well suited for assessing repair quality, to systematically characterize and understand the nature of overfitting in semantics-based APR. We find that, as in heuristics-based APR, overfitting also occurs in semantics-based APR, and in various ways.
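The notion of an overfitting patch described in the abstract can be illustrated with a minimal sketch. Everything here is a hypothetical example and is not taken from the paper or its benchmarks: the function names (`median_buggy`, `median_overfit`, `median_correct`), the helper `passes_all`, and both test suites are assumptions chosen only to show a patch that passes the tests used for repair yet fails on held-out tests.

```python
# Hypothetical illustration of patch overfitting; all names and test
# values are assumptions, not material from the paper.

def median_buggy(a, b, c):
    """Buggy median-of-three: the final fallback returns the wrong value."""
    if a <= b <= c or c <= b <= a:
        return b
    if b <= a <= c or c <= a <= b:
        return a
    return b  # bug: the median in this remaining case is c


def median_overfit(a, b, c):
    """Overfitting patch: special-cases the one failing repair test."""
    if (a, b, c) == (1, 3, 2):
        return 2
    return median_buggy(a, b, c)


def median_correct(a, b, c):
    """Reference behaviour: the middle element after sorting."""
    return sorted((a, b, c))[1]


# Tests available to the repair tool (the last one exposes the bug).
REPAIR_TESTS = [((1, 2, 3), 2), ((3, 2, 1), 2), ((2, 1, 3), 2), ((1, 3, 2), 2)]

# Independent tests used only to judge whether a patch generalizes.
HELD_OUT_TESTS = [((3, 5, 4), 4), ((5, 9, 7), 7), ((2, 1, 3), 2)]


def passes_all(func, tests):
    """Return True if func gives the expected output on every test."""
    return all(func(*args) == expected for args, expected in tests)


if __name__ == "__main__":
    for name, func in [("buggy", median_buggy),
                       ("overfit", median_overfit),
                       ("correct", median_correct)]:
        print(name,
              "repair suite:", passes_all(func, REPAIR_TESTS),
              "held-out suite:", passes_all(func, HELD_OUT_TESTS))
```

Under the test-passing assumption alone, `median_overfit` would be accepted as a repair; only the held-out suite reveals that it does not generalize.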
Pages: 163-163
Number of pages: 1