A Comparative Analysis of Clone Detection Techniques on SemanticCloneBench

Authors
Rabbani, Sohaib Masood [1]
Gulzar, Nabeel Ahmad [1]
Arshad, Saad [1]
Abid, Shamsa [2]
Shamail, Shafay [1]
Affiliations
[1] LUMS, Dept Comp Sci, SBASSE, Lahore, Pakistan
[2] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
Source
2022 IEEE 16TH INTERNATIONAL WORKSHOP ON SOFTWARE CLONES (IWSC 2022) | 2022
Keywords
Semantic Clone Detection; SemanticCloneBench; Deep Learning; Semantic Similarity; CodeBERT; Large-Variance Clones;
DOI
10.1109/IWSC55060.2022.00011
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification codes
081202; 0835;
Abstract
Semantic code clone detection is the detection of functionally similar code fragments that may otherwise be lexically, syntactically, or structurally dissimilar. It has important applications in aspect mining and product line analysis. Detecting semantic code clones accurately is challenging, and various techniques have been proposed. However, these techniques have been evaluated on different datasets, so there is no clear picture of how they perform relative to each other. Recently, SemanticCloneBench was introduced as a benchmark for semantic clones, which makes it possible to evaluate and compare semantic code clone detection techniques on a common dataset. In this paper, we compare the semantic clone detection performance of three techniques, namely FACER-CD, CodeBERT, and NIL, on Java code clones from SemanticCloneBench. FACER-CD detects clones by clustering on API usage similarity, CodeBERT is a deep-learning-based approach that uses a pre-trained programming language model, and NIL is a token-based detector for large-gapped code clones. FACER-CD, NIL, and CodeBERT achieve a recall of 64.3%, 12.7%, and 83.2% respectively on SemanticCloneBench. Using all three techniques together on the SemanticCloneBench dataset gives an overall recall of 95.5%, which is currently the best performance achieved on SemanticCloneBench.
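The abstract does not say how the three detectors are combined. One plausible reading, shown here only as a minimal sketch, is that a benchmark clone pair counts as detected if at least one tool reports it, so the combined recall is the recall of the union of the tools' results. The function names, tool names, and pair IDs below are hypothetical toy values, not data or code from the paper.

```python
def recall(detected, benchmark):
    """Fraction of benchmark clone pairs that appear in `detected`."""
    if not benchmark:
        raise ValueError("benchmark must not be empty")
    return len(detected & benchmark) / len(benchmark)


def combined_recall(detections, benchmark):
    """Recall of the union of all tools' detected pairs.

    Assumption: a pair counts as found if any single tool reports it.
    """
    union = set().union(*detections.values())
    return recall(union, benchmark)


# Toy benchmark of 10 clone pairs, identified by integer IDs (hypothetical).
benchmark = set(range(10))

# Hypothetical per-tool detections; these are not the paper's results.
detections = {
    "tool_a": {0, 1, 2, 3, 4, 5},
    "tool_b": {5, 6},
    "tool_c": {0, 1, 2, 3, 4, 6, 7, 8},
}

per_tool = {name: recall(found, benchmark) for name, found in detections.items()}
overall = combined_recall(detections, benchmark)

for name, value in per_tool.items():
    print(f"{name}: {value:.1%}")
print(f"combined: {overall:.1%}")
```

Under this reading, the combined recall can exceed the best single tool only to the extent that the tools find different pairs, which is why a detector with low individual recall can still raise the total.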
Pages: 16-22
Page count: 7