Overfitting in Semantics-based Automated Program Repair

Cited by: 14
Authors
Le, Xuan-Bach D. [1]
Thung, Ferdian [1]
Lo, David [1]
Le Goues, Claire [2]
Affiliations
[1] Singapore Management Univ, Singapore, Singapore
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
Automated Program Repair; Program Synthesis; Symbolic Execution; Patch Overfitting
DOI
10.1145/3180155.3182536
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Existing automated program repair (APR) techniques can be broadly divided into two families: semantics-based and heuristics-based. Semantics-based APR uses symbolic execution and test suites to extract semantic constraints, then uses program synthesis to synthesize repairs that satisfy those constraints. Heuristics-based APR generates large populations of repair candidates via source manipulation and searches for the best among them. Both families largely rely on the assumption that a program is correctly patched if the generated patch leads the program to pass all provided test cases. Patch correctness is thus an especially pressing concern: a repair technique may generate overfitting patches, which lead a program to pass all existing test cases but fail to generalize beyond them. In this work, we revisit the overfitting problem with a focus on semantics-based APR techniques, complementing previous studies of overfitting in heuristics-based APR. We perform our study using the IntroClass and Codeflaws benchmarks, two datasets well suited for assessing repair quality, to systematically characterize and understand the nature of overfitting in semantics-based APR. We find that, as in heuristics-based APR, overfitting also occurs in semantics-based APR, and in a variety of ways.
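The notion of an overfitting patch described in the abstract can be illustrated with a minimal Python sketch. Everything here is hypothetical and chosen only for illustration: the median-of-three task, the names `reference_median`, `overfitting_patch`, and `passes_all`, and the two small test suites are assumptions, not material taken from the paper or from the IntroClass and Codeflaws benchmarks.

```python
from typing import Callable, List, Tuple

# A test case pairs three integer inputs with the expected median.
Test = Tuple[Tuple[int, int, int], int]


def reference_median(a: int, b: int, c: int) -> int:
    """Correct behavior: the middle value of the three inputs."""
    return sorted((a, b, c))[1]


def overfitting_patch(a: int, b: int, c: int) -> int:
    """Hypothetical patch that happens to satisfy the repair tests only."""
    if a <= b <= c:
        return b
    # Wrong in general: 'a' is not always the median.
    return a


def passes_all(patch: Callable[[int, int, int], int], tests: List[Test]) -> bool:
    """True if the patch produces the expected output on every test."""
    return all(patch(*inputs) == expected for inputs, expected in tests)


# Tests available to the repair tool (hypothetical).
repair_tests: List[Test] = [((1, 2, 3), 2), ((2, 1, 3), 2), ((5, 5, 5), 5)]

# Independent held-out tests used to judge generalization (hypothetical).
held_out_tests: List[Test] = [((3, 1, 2), 2), ((1, 3, 2), 2), ((2, 3, 1), 2)]

if __name__ == "__main__":
    print("patch passes repair tests:  ", passes_all(overfitting_patch, repair_tests))
    print("patch passes held-out tests:", passes_all(overfitting_patch, held_out_tests))
```

In this sketch the patch would be accepted under the "passes all provided tests" assumption, yet it fails on held-out inputs; comparing results on an independent test suite is one simple way to expose that gap.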
Pages: 163-163
Page count: 1