Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition

Authors
Neumuller, Denis [1]
Straub, Raphael [1]
Sihler, Florian [1]
Tichy, Matthias [1]
Affiliations
[1] Univ Ulm, Ulm, Germany
Keywords
algorithm recognition; program comprehension; pattern matching; abstract syntax tree; domain-specific language; reverse engineering; maintenance; open-source software
DOI
10.1109/ICCQ60895.2024.10576984
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
The automated recognition of algorithm implementations can support many software maintenance and reengineering activities by providing knowledge about the concerns present in the code base. Moreover, recognizing inefficient algorithms like Bubble Sort and suggesting superior alternatives from a library can help in assessing and improving the quality of a system. Approaches from related work suffer from usability and scalability issues, and their accuracy has not been evaluated. In this paper, we investigate how well our approach, which is based on the abstract syntax tree of a program, performs for automatic algorithm recognition. To this end, we have implemented a prototype consisting of a domain-specific language designed to capture the key features of an algorithm and used to express a search pattern on the abstract syntax tree, a matching algorithm to find these features, and an initial catalog of "ready to use" patterns. To create our search patterns, we performed a web search using the algorithm name and described the key features of the reference implementations we found with our domain-specific language. We evaluate our prototype on a subset of the BigCloneEval benchmark containing algorithms like Fibonacci, Bubble Sort, and Binary Search. We achieve an average F1-score of 0.74, outperforming the large language model Codellama, which attains 0.35. Additionally, we use multiple code clone detection tools as a baseline for comparison, achieving a recall of 0.62 while the best-performing tool reaches 0.20.
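The idea of describing an algorithm by a few key features and searching for them on the abstract syntax tree can be illustrated with a minimal sketch. This is not the authors' domain-specific language or matching algorithm: it uses Python's standard `ast` module on Python source (Python 3.9+ for `ast.unparse`), and the feature predicates (`has_nested_loop`, `has_element_comparison`, `has_tuple_swap`), the pattern list `BUBBLE_SORT_PATTERN`, and the helper `find_matches` are all hypothetical names chosen for illustration.

```python
import ast

LOOPS = (ast.For, ast.While)

def has_nested_loop(func):
    """Feature: some loop contains another loop."""
    for outer in ast.walk(func):
        if isinstance(outer, LOOPS):
            for inner in ast.walk(outer):
                if inner is not outer and isinstance(inner, LOOPS):
                    return True
    return False

def _subscript_base(node):
    """Return the array name of an expression like a[i], else None."""
    if isinstance(node, ast.Subscript) and isinstance(node.value, ast.Name):
        return node.value.id
    return None

def has_element_comparison(func):
    """Feature: two elements of the same array are compared."""
    for node in ast.walk(func):
        if isinstance(node, ast.Compare) and len(node.comparators) == 1:
            left = _subscript_base(node.left)
            right = _subscript_base(node.comparators[0])
            if left is not None and left == right:
                return True
    return False

def has_tuple_swap(func):
    """Feature: a swap written as x, y = y, x."""
    for node in ast.walk(func):
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Tuple)
                and isinstance(node.value, ast.Tuple)):
            targets = [ast.unparse(t) for t in node.targets[0].elts]
            values = [ast.unparse(v) for v in node.value.elts]
            if len(targets) == 2 and targets == values[::-1]:
                return True
    return False

# Hypothetical "pattern": a function matches if every key feature is present.
BUBBLE_SORT_PATTERN = [has_nested_loop, has_element_comparison, has_tuple_swap]

def find_matches(source, pattern):
    """Return the names of functions in `source` that show all features."""
    tree = ast.parse(source)
    return [node.name for node in ast.walk(tree)
            if isinstance(node, ast.FunctionDef)
            and all(feature(node) for feature in pattern)]

SAMPLE = '''
def bubble_sort(a):
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)
'''

print(find_matches(SAMPLE, BUBBLE_SORT_PATTERN))
```

Such a coarse feature set would likely also flag other quadratic exchange sorts, so a usable pattern needs more specific features (for example, that the compared indices are adjacent).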
Pages: 18