A Probabilistic Delta Debugging Approach for Abstract Syntax Trees

Cited by: 1
Authors
Wang, Guancheng [1]
Wu, Yiqian [1]
Zhu, Qihao [1]
Xiong, Yingfei [1]
Zhang, Xin [1]
Zhang, Lu [1]
Affiliations
[1] Peking Univ, Sch Comp Sci, Minist Educ, Key Lab High Confidence Software Technol, Beijing 100871, Peoples R China
Source
2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE | 2023
Funding
National Natural Science Foundation of China;
Keywords
Delta Debugging; Probabilistic Model; Abstract Syntax Tree;
DOI
10.1109/ISSRE59848.2023.00060
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Delta debugging provides an efficient and systematic approach to isolating a minimal subsequence that exhibits a specific property. A notable trend in the development of delta debugging is to address data with domain-specific structures, such as programs. However, the efficiency and effectiveness of domain-specific delta debugging algorithms still present challenges. Probabilistic delta debugging (ProbDD) enhances the ddmin algorithm, which forms the foundation of most domain-specific delta debugging approaches, by incorporating a probabilistic model. By replacing the ddmin component with ProbDD, algorithms relying on ddmin can achieve superior performance. Meanwhile, domain-specific delta debugging techniques, such as Perses, are designed to operate on the abstract syntax tree (AST) and follow predefined sequences of attempts to minimize programs. These techniques benefit from AST-based transformations, which enable them to achieve even smaller results efficiently. However, we observe that ProbDD assumes independence between elements, which may limit its ability to capture syntactic relationships. Additionally, domain-specific approaches such as Perses rely on a predefined sequence of attempts to remove elements and fail to utilize the information from existing test results. In this paper, we propose T-PDD, a novel approach that addresses these limitations. T-PDD leverages the AST to construct a probabilistic model that both utilizes historical test results and captures syntactic relationships to estimate the probability of each element being retained in the result. It selects a set of elements that maximizes the gain for the next test based on the model and updates the model using the test results. In our evaluation, we assess our approach on 107 real-world subjects. The results demonstrate an average improvement of 26.95% in processing time and a 3.4x reduction in result size compared to Perses in the best-case scenario.
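To make the select-test-update loop described in the abstract concrete, the following is a minimal Python sketch of a ProbDD-style baseline that treats elements as independent, which is the assumption the abstract says T-PDD relaxes. It is not the authors' T-PDD and does not use an AST. The names `prob_dd`, `still_fails`, and `p0`, the initial probability of 0.25, the gain formula (number of removed elements times the chance that none of them is needed), and the update rule (conditioning on "at least one removed element was needed") are all assumptions made for illustration.

```python
def prob_dd(elements, still_fails, p0=0.25):
    """Simplified ProbDD-style minimization (independent-element model).

    elements    -- list of items to minimize
    still_fails -- hypothetical oracle: True if the candidate list still
                   exhibits the property of interest
    p0          -- assumed initial probability (0 < p0 < 1) that an element
                   must stay in the result
    """
    p = {i: p0 for i in range(len(elements))}
    kept = set(p)  # indices still present in the current candidate
    while any(p[i] < 1.0 for i in kept):
        # Consider undecided elements, least likely to be needed first.
        order = sorted((i for i in kept if p[i] < 1.0), key=lambda i: (p[i], i))
        chosen, best_gain, survive = [], 0.0, 1.0
        for i in order:
            # survive: probability that none of the chosen elements is needed.
            survive_next = survive * (1.0 - p[i])
            gain = (len(chosen) + 1) * survive_next  # expected elements removed
            if gain < best_gain:
                break
            chosen.append(i)
            best_gain, survive = gain, survive_next
        candidate = [elements[i] for i in sorted(kept - set(chosen))]
        if still_fails(candidate):
            kept -= set(chosen)  # removal succeeded; drop the elements for good
        else:
            # At least one chosen element is needed: raise each probability
            # by conditioning on that event (capped at 1.0).
            denom = 1.0 - survive
            for i in chosen:
                p[i] = min(1.0, p[i] / denom)
    return [elements[i] for i in sorted(kept)]
```

For example, with `elements = list(range(8))` and an oracle that holds only when both 3 and 6 are present, `prob_dd(list(range(8)), lambda c: 3 in c and 6 in c)` reduces the input to those two elements. A tree-aware variant in the spirit of T-PDD would additionally tie the probabilities of syntactically related nodes together; that part is not sketched here.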
Pages: 763 - 773
Number of pages: 11