Gapped Code Clone Detection with Lightweight Source Code Analysis

Authors
Murakami, Hiroaki [1]
Hotta, Keisuke [1]
Higo, Yoshiki [1]
Igaki, Hiroshi [1]
Kusumoto, Shinji [1]
Affiliations
[1] Osaka Univ, Grad Sch Informat Sci & Technol, Suita, Osaka 5650871, Japan
Keywords
Code Clone; Program Analysis; Software Maintenance; Tool Comparison; SYSTEM
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A variety of methods for detecting code clones have been proposed. To detect gapped code clones, techniques based on abstract syntax trees (ASTs), program dependence graphs (PDGs), and metrics, as well as text-based techniques using the longest common subsequence (LCS) algorithm, have been proposed. However, each of these techniques has limitations. Existing AST-based and PDG-based techniques incur the cost of transforming source files into intermediate representations such as ASTs or PDGs and then comparing them. Existing metric-based techniques and text-based techniques using the LCS algorithm cannot detect code clones when methods or blocks are only partially duplicated. This paper proposes a new method that detects gapped code clones using the Smith-Waterman algorithm to resolve these limitations. The Smith-Waterman algorithm identifies similar alignments between two sequences even if they include some gaps. The authors implemented the proposed method as a software tool named CDSW and confirmed, through a quantitative evaluation with Bellon's benchmark, that it resolves these limitations.
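To illustrate the idea of aligning two sequences despite gaps, here is a minimal sketch of Smith-Waterman local alignment in Python. It is not the CDSW implementation: the function name `smith_waterman`, the scoring parameters (`match=2`, `mismatch=-1`, `gap=-1`), and the use of plain token lists as input are all assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def smith_waterman(
    a: Sequence[str],
    b: Sequence[str],
    match: int = 2,      # assumed reward for identical tokens
    mismatch: int = -1,  # assumed penalty for differing tokens
    gap: int = -1,       # assumed penalty for skipping a token
) -> Tuple[int, List[Pair]]:
    """Return the best local alignment score and the aligned token pairs.

    A pair (x, None) or (None, y) marks a gap in one of the sequences.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score table; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    pairs: List[Pair] = []
    while i > 0 and j > 0 and score[i][j] > 0:
        s = score[i][j]
        step = match if a[i - 1] == b[j - 1] else mismatch
        if s == score[i - 1][j - 1] + step:
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif s == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))  # token of a has no partner in b
            i -= 1
        else:
            pairs.append((None, b[j - 1]))  # token of b has no partner in a
            j -= 1
    pairs.reverse()
    return best, pairs


if __name__ == "__main__":
    # Hypothetical token sequences; the second contains one inserted token.
    total, alignment = smith_waterman(list("ABCDE"), list("ABXCDE"))
    print(total, alignment)
```

With these assumed scores, aligning `ABCDE` against `ABXCDE` yields a score of 9 (five matches and one gap), and the inserted `X` shows up as a `(None, "X")` pair in the alignment. A clone detector built on this idea would typically run the alignment over normalized token or statement sequences rather than single characters.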
Pages: 93-102
Page count: 10