A Comparative Analysis of Clone Detection Techniques on SemanticCloneBench

Cited by: 1
Authors
Rabbani, Sohaib Masood [1]
Gulzar, Nabeel Ahmad [1]
Arshad, Saad [1]
Abid, Shamsa [2]
Shamail, Shafay [1]
Affiliations
[1] LUMS, Dept Comp Sci, SBASSE, Lahore, Pakistan
[2] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
Source
2022 IEEE 16th International Workshop on Software Clones (IWSC 2022), 2022
Keywords
Semantic Clone Detection; SemanticCloneBench; Deep Learning; Semantic Similarity; CodeBERT; Large-Variance Clones
DOI
10.1109/IWSC55060.2022.00011
Chinese Library Classification
TP31 [Computer Software]
Discipline Classification Codes
081202; 0835
Abstract
Semantic code clone detection is the detection of functionally similar code fragments that may otherwise be lexically, syntactically, or structurally dissimilar. It has important applications in aspect mining and product line analysis. Accurate detection of semantic code clones is challenging, and various techniques have been proposed. However, these techniques have been evaluated on different datasets, so there is no clear picture of how they perform relative to each other. SemanticCloneBench was recently introduced as a benchmark for semantic clones, which makes it possible to evaluate and compare semantic code clone detection techniques on common ground. In this paper, we compare the semantic clone detection performance of three code clone detection techniques, FACER-CD, CodeBERT, and NIL, on Java code clones using SemanticCloneBench. FACER-CD detects clones by clustering on API usage similarity, CodeBERT is a deep-learning-based approach that uses a pre-trained programming language model, and NIL is a token-based detector for large-gapped code clones. FACER-CD, NIL, and CodeBERT achieve recalls of 64.3%, 12.7%, and 83.2%, respectively, on SemanticCloneBench. Using all three techniques together yields an overall recall of 95.5%, which is currently the best performance achieved on SemanticCloneBench.
Pages: 16-22
Number of pages: 7
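The abstract reports a recall for each technique and a higher recall when all three are used together. The following is a minimal sketch of how such a combined figure can be computed, assuming each detector's output is available as a set of clone-pair identifiers. The function names and the toy data are hypothetical illustrations, not the authors' evaluation code or the benchmark's actual contents.

```python
def recall(detected, reference):
    """Fraction of reference clone pairs that appear in `detected`."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(detected & reference) / len(reference)


def combined_recall(detections, reference):
    """Recall of the union of all detectors' outputs."""
    union = set().union(*detections.values())
    return recall(union, reference)


# Hypothetical toy data: ten reference clone pairs, identified by integers.
reference = set(range(1, 11))
detections = {
    "FACER-CD": {1, 2, 3, 4, 5, 6},
    "NIL": {1, 7},
    "CodeBERT": {2, 3, 4, 5, 6, 7, 8, 9},
}

per_tool = {name: recall(found, reference) for name, found in detections.items()}
overall = combined_recall(detections, reference)

for name, value in per_tool.items():
    print(f"{name}: {value:.1%}")
print(f"combined: {overall:.1%}")
```

Because the detectors overlap in the pairs they find, the combined recall is the recall of the union of their outputs rather than the sum of the individual recalls. The same holds for the reported figures: 64.3%, 12.7%, and 83.2% sum to more than the combined 95.5%.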