A Proposed Model for Source Code Reuse Detection in Computer Programs

Authors
Zahra Setoodeh
Mohammad Reza Moosavi
Mostafa Fakhrahmad
Mohammad Bidoki
Affiliations
[1] Shiraz University, Department of Computer Science and Engineering and IT
[2] Persian Gulf University, Department of Computer Engineering, School of Engineering
Source
Iranian Journal of Science and Technology, Transactions of Electrical Engineering | 2021 / Vol. 45, No. 3
Keywords
Plagiarism detection; Source code reuse; SOCO; Structure-based approach;
DOI
Not available
Abstract
Source code reuse detection is of growing significance as a common plagiarism prevention practice in academia. For a large collection of source code files, manual detection of code reuse is impractical, so there is a clear need for automatic and highly accurate tools. This paper introduces a structure-based approach for recognizing source code reuse (SOCO) in reference programs. The proposed model consists of three main phases: preprocessing, sequence generation, and decision-making based on estimated similarities. First, the important instructions in each code file are identified, and the source code is converted to a string of specific tokens. A sequence alignment process is then carried out, and a tree representation of the source code is constructed. In the third phase, the similarity values among the code files are estimated using three different strategies based on both lexical and structural comparison of the source code. Finally, the system decides, for each pair of files, whether reuse has occurred. The SOCO-2014 corpus is used to evaluate the method. A comparison of the model's experimental results with those of the contest participants indicates that the proposed method's performance is acceptable and promising.
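The abstract does not give the token set, the alignment algorithm, or the decision rule, so the following is only a minimal sketch of the general idea (tokenize, compare, threshold), not the authors' method. The keyword map `TOKEN_MAP`, the helper names, and the `0.8` threshold are all hypothetical, and Python's `difflib.SequenceMatcher` is used as a simple stand-in for the sequence alignment step. The tree representation and the three similarity strategies are not modeled.

```python
import re
from difflib import SequenceMatcher

# Hypothetical map from "important" C-like keywords to one-letter tokens.
# The paper's actual token set is not stated in the abstract.
TOKEN_MAP = {
    "if": "I", "else": "E", "for": "L", "while": "L",
    "switch": "S", "case": "C", "return": "R",
}


def strip_comments(code: str) -> str:
    """Preprocessing: drop /* ... */ and // ... comments."""
    code = re.sub(r"/\*.*?\*/", " ", code, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", " ", code)


def to_token_string(code: str) -> str:
    """Sequence generation: keep only mapped keywords, in source order."""
    words = re.findall(r"[A-Za-z_]\w*", strip_comments(code))
    return "".join(TOKEN_MAP[w] for w in words if w in TOKEN_MAP)


def similarity(code_a: str, code_b: str) -> float:
    """Similarity of the two token strings, in [0, 1]."""
    a, b = to_token_string(code_a), to_token_string(code_b)
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def is_reuse(code_a: str, code_b: str, threshold: float = 0.8) -> bool:
    """Decision step: flag the pair when similarity reaches the threshold."""
    return similarity(code_a, code_b) >= threshold


original = "int f(int n){ int s=0; for(int i=0;i<n;i++){ if(i%2==0) s+=i; } return s; }"
renamed = "int g(int k){ int t=0; /* return if even */ for(int j=0;j<k;j++){ if(j%2==0) t+=j; } return t; }"
other = "int h(int n){ while(n>0){ n--; } switch(n){ case 0: return 1; } return 0; }"

print(is_reuse(original, renamed))  # identifier renaming does not change the token string
print(is_reuse(original, other))    # different control structure
```

Because identifiers and comments are discarded before comparison, renaming variables or inserting comments leaves the token string unchanged, which is the intuition behind comparing structure rather than raw text.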
Pages: 1001-1014
Page count: 13