A Comparative Analysis of Clone Detection Techniques on SemanticCloneBench

Cited by: 1
Authors
Rabbani, Sohaib Masood [1]
Gulzar, Nabeel Ahmad [1]
Arshad, Saad [1]
Abid, Shamsa [2]
Shamail, Shafay [1]
Affiliations
[1] LUMS, Dept Comp Sci, SBASSE, Lahore, Pakistan
[2] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
Keywords
Semantic Clone Detection; SemanticCloneBench; Deep Learning; Semantic Similarity; CodeBERT; Large-Variance Clones
DOI
10.1109/IWSC55060.2022.00011
Chinese Library Classification
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Semantic code clone detection is the detection of functionally similar code fragments that may otherwise be lexically, syntactically, or structurally dissimilar. It has important applications in aspect mining and product line analysis. Accurately detecting semantic code clones is challenging, and various techniques have been proposed. However, these techniques have been evaluated on different datasets, so there is no clear picture of how they perform relative to each other. SemanticCloneBench was recently introduced as a benchmark for semantic clones, which makes it possible to evaluate and compare semantic code clone detection techniques on a common dataset. In this paper, we compare the semantic clone detection performance of three code clone detection techniques, namely FACER-CD, CodeBERT, and NIL, on Java code clones from SemanticCloneBench. FACER-CD detects clones by clustering on API usage similarity, CodeBERT is a deep-learning-based approach that uses a pre-trained programming language model, and NIL is a token-based detector of large-gapped code clones. FACER-CD, NIL, and CodeBERT achieve recalls of 64.3%, 12.7%, and 83.2%, respectively, on SemanticCloneBench. Using all three techniques together gives an overall recall of 95.5%, which is currently the best performance achieved on SemanticCloneBench.
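The abstract reports per-technique recall and a combined recall but does not say how the combined figure was computed. A minimal sketch follows, assuming that "using all three techniques together" means counting a benchmark clone pair as detected when at least one technique reports it. The function names, the tool labels (`tool_a`, `tool_b`, `tool_c`), and the toy pair IDs are hypothetical illustrations, not data or code from the paper.

```python
def recall(detected, benchmark):
    """Fraction of benchmark clone pairs that appear in `detected`."""
    if not benchmark:
        raise ValueError("benchmark must not be empty")
    return len(detected & benchmark) / len(benchmark)


def combined_recall(tool_results, benchmark):
    """Recall of the union of all tools' detections (assumed combination rule)."""
    union = set().union(*tool_results.values())
    return recall(union, benchmark)


# Hypothetical toy data: integers stand in for benchmark clone-pair IDs.
benchmark = set(range(1, 11))
tool_results = {
    "tool_a": {1, 2, 3, 4, 5, 6},
    "tool_b": {2, 9},
    "tool_c": {1, 2, 3, 5, 6, 7, 8},
}

for name, detected in tool_results.items():
    print(name, recall(detected, benchmark))
print("combined", combined_recall(tool_results, benchmark))
```

Under this assumption the combined recall can never be lower than the best individual recall, and it exceeds it only when the techniques find different clone pairs.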
Pages: 16-22
Page count: 7
Source
Proceedings - 2022 IEEE 16th International Workshop on Software Clones, IWSC 2022, 2022: 16-22