CMCD: Count Matrix based Code Clone Detection

Authors
Yuan, Yang [1]
Guo, Yao [1]
Affiliations
[1] Peking Univ, Natl Engn Res Ctr Software Engn, Key Lab High Confidence Software Technol, Minist Educ, Sch Elect Engn & Comp Sci, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Code clone detection; count matrix; bipartite graph matching; SOFTWARE;
DOI
10.1109/APSC.2011.13
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
This paper introduces CMCD, a count-matrix-based technique for detecting clones in program code. The key concept behind CMCD is the count matrix, which is built by counting how often each variable occurs in situations specified by predetermined counting conditions. Because the characteristics of the count matrix are unchanged by variable renaming or even by reordering statements, CMCD works well on many hard-to-detect code clones, such as those produced by swapping statements or deleting a few lines, which are difficult for other state-of-the-art detection techniques. We obtained the following results using CMCD: (1) we successfully detected all 16 clone scenarios proposed by C. Roy et al.; (2) we discovered two clone clusters with three copies each among 29 student-submitted compiler lab projects; (3) we identified 174 code clone clusters and a potential bug in the JDK 1.6 source files.
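The count-matrix idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the four counting conditions, the toy statement tuples, and the names `count_matrix` and `matrix_distance` are all assumptions made for illustration, since the abstract does not list the actual conditions. The keywords mention bipartite graph matching; here a brute-force search over variable pairings stands in for a real matching algorithm, which is only practical for a handful of variables.

```python
from itertools import permutations

# Hypothetical counting conditions; the paper's real conditions are not given here.
CONDITIONS = ("used", "defined", "in_loop", "in_condition")

def count_matrix(statements):
    """Return {variable: [count per condition]} for a code fragment.

    Each statement is a toy tuple: (defined_vars, used_vars, in_loop, in_condition).
    """
    matrix = {}
    occurrences_of = lambda defined, used: [(v, 1) for v in defined] + [(v, 0) for v in used]
    for defined, used, in_loop, in_condition in statements:
        for var, column in occurrences_of(defined, used):
            row = matrix.setdefault(var, [0] * len(CONDITIONS))
            row[column] += 1               # count the use or the definition
            row[2] += int(in_loop)         # occurrence inside a loop
            row[3] += int(in_condition)    # occurrence inside a condition
    return matrix

def matrix_distance(m1, m2):
    """Smallest total row difference over all one-to-one variable pairings."""
    rows1, rows2 = list(m1.values()), list(m2.values())
    size = max(len(rows1), len(rows2))
    zero = [0] * len(CONDITIONS)
    rows1 += [zero] * (size - len(rows1))  # pad the smaller matrix with empty rows
    rows2 += [zero] * (size - len(rows2))
    cost = lambda a, b: sum(abs(x - y) for x, y in zip(a, b))
    return min(
        sum(cost(rows1[i], rows2[j]) for i, j in enumerate(perm))
        for perm in permutations(range(size))
    )

# total = 0; count = 0; for i in items: total = total + i; if total: ...
original = [
    (["total"], [], False, False),
    (["count"], [], False, False),
    (["i"], ["items"], True, False),
    (["total"], ["total", "i"], True, False),
    ([], ["total"], False, True),
]
# Same fragment with every variable renamed and the first two statements swapped.
clone = [
    (["n"], [], False, False),
    (["s"], [], False, False),
    (["x"], ["xs"], True, False),
    (["s"], ["s", "x"], True, False),
    ([], ["s"], False, True),
]
unrelated = [(["a"], ["b", "c"], False, False)]

print(matrix_distance(count_matrix(original), count_matrix(clone)))
print(matrix_distance(count_matrix(original), count_matrix(unrelated)))
```

Because the rows depend only on how each variable is used, not on its name or on statement order, the renamed and reordered fragment yields a distance of zero, while the unrelated fragment does not. A real detector would also need a threshold on this distance, which the sketch leaves out.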
Pages: 250 - 257
Number of pages: 8
Related papers
50 in total
  • [41] Multi-threshold token-based code clone detection
    Golubev, Yaroslav
    Poletansky, Viktor
    Povarov, Nikita
    Bryksin, Timofey
    2021 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2021), 2021, : 496 - 500
  • [42] VFDETECT: A Vulnerable Code Clone Detection System Based on Vulnerability Fingerprint
    Liu, Zhen
    Wei, Qiang
    Cao, Yan
    2017 IEEE 3RD INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC), 2017, : 548 - 553
  • [43] An enhanced transformer-based framework for interpretable code clone detection
    Nashaat, Mona
    Amin, Reem
    Eid, Ahmad Hosny
    Abdel-Kader, Rabab F.
    JOURNAL OF SYSTEMS AND SOFTWARE, 2025, 222
  • [44] Gapped Code Clone Detection with Lightweight Source Code Analysis
    Murakami, Hiroaki
    Hotta, Keisuke
    Higo, Yoshiki
    Igaki, Hiroshi
    Kusumoto, Shinji
    2013 IEEE 21ST INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC), 2013, : 93 - 102
  • [45] SourcererCC: Scaling Code Clone Detection to Big-Code
    Sajnani, Hitesh
    Saini, Vaibhav
    Svajlenko, Jeffrey
    Roy, Chanchal K.
    Lopes, Cristina V.
    2016 IEEE/ACM 38TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2016, : 1157 - 1168
  • [46] Code Clone Detection Using Decentralized Architecture and Code Reduction
    Patil, Ritesh V.
    Joshi, Shashank D.
    Shinde, Sachin V.
    Ajagekar, Digvijay A.
    Bankar, Shubham D.
    2015 INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING (ICPC), 2015,
  • [47] Learn To Align: A Code Alignment Network For Code Clone Detection
    Zhang, Aiping
    Liu, Kui
    Fang, Liming
    Liu, Qianjun
    Yun, Xinyu
    Ji, Shouling
    2021 28TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2021), 2021, : 1 - 11
  • [48] Efficient transformer with code token learner for code clone detection
    Zhang, Aiping
    Fang, Liming
    Ge, Chunpeng
    Li, Piji
    Liu, Zhe
    JOURNAL OF SYSTEMS AND SOFTWARE, 2023, 197
  • [49] SimilaR: R Code Clone and Plagiarism Detection
    Bartoszuk, Maciej
    Gagolewski, Marek
    R JOURNAL, 2020, 12 (01): : 367 - 385
  • [50] Code Clone Detection on Specialized PDGs with Heuristic
    Higo, Yoshiki
    Kusumoto, Shinji
    2011 15TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING (CSMR), 2011, : 75 - 84