Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition

Cited by: 0
Authors
Neumuller, Denis [1]
Straub, Raphael [1]
Sihler, Florian [1]
Tichy, Matthias [1]
Affiliations
[1] Univ Ulm, Ulm, Germany
Keywords
algorithm recognition; program comprehension; pattern matching; abstract syntax tree; domain-specific language; reverse engineering; maintenance; open-source software
DOI
10.1109/ICCQ60895.2024.10576984
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The automated recognition of algorithm implementations can support many software maintenance and reengineering activities by providing knowledge about the concerns present in the code base. Moreover, recognizing inefficient algorithms like Bubble Sort and suggesting superior alternatives from a library can help in assessing and improving the quality of a system. Approaches from related work suffer from usability as well as scalability issues, and their accuracy is not evaluated. In this paper, we investigate how well our approach based on the abstract syntax tree of a program performs for automatic algorithm recognition. To this end, we have implemented a prototype consisting of a domain-specific language designed to capture the key features of an algorithm and used to express a search pattern on the abstract syntax tree, a matching algorithm to find these features, and an initial catalog of "ready to use" patterns. To create our search patterns, we performed a web search using the algorithm name and described key features of the found reference implementations with our domain-specific language. We evaluate our prototype on a subset of the BigCloneEval benchmark containing algorithms like Fibonacci, Bubble Sort, and Binary Search. We achieve an average F1-score of 0.74, outperforming the large language model Codellama, which attains 0.35. Additionally, we use multiple code clone detection tools as a baseline for comparison, achieving a recall of 0.62 while the best-performing tool reaches 0.20.
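The record does not reproduce the authors' domain-specific language or matcher, so the following is only a minimal sketch of the general idea of describing an algorithm by key features and searching for them on an abstract syntax tree. It uses Python's standard `ast` module rather than the paper's tooling, and the function names (`looks_like_bubble_sort` and the two helpers) and the chosen features (a nested loop, a comparison of two subscripts, a tuple swap of two subscripts) are assumptions made for illustration.

```python
import ast

LOOPS = (ast.For, ast.While)


def _compares_two_subscripts(node: ast.AST) -> bool:
    """True if some comparison under `node` has subscripts on both sides, e.g. a[j] > a[j + 1]."""
    for n in ast.walk(node):
        if (isinstance(n, ast.Compare)
                and isinstance(n.left, ast.Subscript)
                and any(isinstance(c, ast.Subscript) for c in n.comparators)):
            return True
    return False


def _swaps_two_subscripts(node: ast.AST) -> bool:
    """True if some assignment under `node` has the shape x[i], x[j] = x[j], x[i]."""
    for n in ast.walk(node):
        if isinstance(n, ast.Assign) and len(n.targets) == 1:
            target, value = n.targets[0], n.value
            if (isinstance(target, ast.Tuple) and isinstance(value, ast.Tuple)
                    and len(target.elts) == 2 and len(value.elts) == 2
                    and all(isinstance(e, ast.Subscript)
                            for e in target.elts + value.elts)):
                return True
    return False


def looks_like_bubble_sort(source: str) -> bool:
    """Heuristic feature check: a loop nested in another loop that compares and swaps elements."""
    tree = ast.parse(source)
    for outer in ast.walk(tree):
        if not isinstance(outer, LOOPS):
            continue
        for inner in ast.walk(outer):
            if inner is outer or not isinstance(inner, LOOPS):
                continue
            if _compares_two_subscripts(inner) and _swaps_two_subscripts(inner):
                return True
    return False


# Hypothetical inputs used only to exercise the sketch.
BUBBLE = """
def sort(a):
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a
"""

BINARY = """
def search(a, x):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        elif a[mid] > x:
            hi = mid - 1
        else:
            return mid
    return -1
"""

print(looks_like_bubble_sort(BUBBLE), looks_like_bubble_sort(BINARY))
```

A check this coarse would likely misclassify other exchange-based sorts and miss swaps written with a temporary variable, which is one reason a dedicated pattern language with a curated catalog, as described in the abstract, may be preferable to ad hoc feature tests.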
Pages: 18
Related papers
50 in total
  • [1] Clone detection algorithm based on the Abstract Syntax Tree approach
    Lazar, Flavius-Mihai
    Banias, Ovidiu
    2014 IEEE 9TH INTERNATIONAL SYMPOSIUM ON APPLIED COMPUTATIONAL INTELLIGENCE AND INFORMATICS (SACI), 2014, : 73 - 78
  • [2] Code Summarization with Abstract Syntax Tree
    Chen, Qiuyuan
    Hu, Han
    Liu, Zhaoyi
    NEURAL INFORMATION PROCESSING, ICONIP 2019, PT V, 2019, 1143 : 652 - 660
  • [3] iASTMapper: An Iterative Similarity-Based Abstract Syntax Tree Mapping Algorithm
    Zhang, Neng
    Chen, Qinde
    Zheng, Zibin
    Zou, Ying
    2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE, 2023, : 863 - 874
  • [4] iASTMapper: An Iterative Similarity-Based Abstract Syntax Tree Mapping Algorithm
    Zhang, Neng
    Chen, Qinde
    Zheng, Zibin
    Zou, Ying
    Proceedings - 2023 38th IEEE/ACM International Conference on Automated Software Engineering, ASE 2023, 2023, : 863 - 874
  • [5] Identifying Energy Efficiency Patterns in Sorting Algorithms via Abstract Syntax Tree Mining
    Krauss, Oliver
    Schuler, Andreas
    Proceedings of the International Conference on Modeling and Applied Simulation, MAS, 2023, 2023-September
  • [6] Inferring Bug Patterns for Detecting Bugs in JavaScript By Analyzing Abstract Syntax Tree
    Tasnim, Afsana
    Rahman, Md Rayhanur
    2018 JOINT 7TH INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS & VISION (ICIEV) AND 2018 2ND INTERNATIONAL CONFERENCE ON IMAGING, VISION & PATTERN RECOGNITION (ICIVPR), 2018, : 503 - 507
  • [7] A Fast Abstract Syntax Tree Interpreter for R
    Kalibera, Tomas
    Maj, Petr
    Morandat, Floreal
    Vitek, Jan
    ACM SIGPLAN NOTICES, 2014, 49 (07) : 89 - 102
  • [8] AN ALGORITHM FOR GENERATING ABSTRACT SYNTAX TREES
    NOONAN, RE
    COMPUTER LANGUAGES, 1985, 10 (3-4): : 225 - 236
  • [9] Static code detection based on abstract syntax tree
    Lu, Xiaofeng
    Fang, Denghui
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2019, 125 : 195 - 195
  • [10] Source Code Pattern as Anchored Abstract Syntax Tree
    Nakayama, Ken
    Sakai, Eko
    2014 5TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2014, : 170 - 173