Identifying Functionally Similar Code in Complex Codebases

Authors
Su, Fang-Hsiang [1]
Bell, Jonathan [1]
Kaiser, Gail [1]
Sethumadhavan, Simha [1]
Affiliation
[1] Columbia Univ, New York, NY 10027 USA
Keywords
I/O behavior; dynamic analysis; code clone detection; data flow analysis; patterns
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Identifying similar code in software systems can assist many software engineering tasks, such as program understanding and software refactoring. While most approaches focus on identifying code that looks alike, some techniques aim to detect code that functions alike. Detecting such functional clones in object-oriented languages remains an open problem because it is difficult to expose and compare programs' functionality effectively. We propose a novel technique, In-Vivo Clone Detection, that detects functional clones in arbitrary programs by identifying and mining their inputs and outputs. The key insight is to use existing workloads to execute programs and then measure functional similarity between programs based on their inputs and outputs, which mitigates the problems in object-oriented languages reported by prior work. We implement this technique in our system, HitoshiIO, which is open source and freely available. Our experimental results show that HitoshiIO detects more than 800 functional clones across a corpus of 118 projects. In a random sample of the detected clones, HitoshiIO achieves a true positive rate of 68+% with a false positive rate of only 15%.
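The idea in the abstract (run code under an existing workload, record inputs and outputs, then compare the recorded behavior) can be illustrated with a minimal sketch. This is not HitoshiIO's implementation: the helper names (record_io, jaccard, run_workload), the toy functions, and the choice of Jaccard similarity over (input, output) pairs are all assumptions made here for illustration only.

```python
from collections import defaultdict


def record_io(log, func):
    """Wrap func so every call stores its (args, result) pair in log.

    Hypothetical helper: a stand-in for instrumentation that observes
    inputs and outputs while a program runs.
    """
    def wrapper(*args):
        result = func(*args)
        log[func.__name__].add((args, result))
        return result
    return wrapper


def jaccard(a, b):
    """Jaccard similarity of two sets; an assumed, simplified metric."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two functions that look different but behave the same on this workload.
def sum_loop(n):
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n):
    return n * (n + 1) // 2


# A function with different behavior, for contrast.
def square(n):
    return n * n


def run_workload(log):
    """Toy 'existing workload': call each instrumented function on 0..5."""
    funcs = [record_io(log, f) for f in (sum_loop, sum_formula, square)]
    for n in range(6):
        for f in funcs:
            f(n)


log = defaultdict(set)
run_workload(log)

# Compare functions by the I/O pairs observed during execution.
sim_clone = jaccard(log["sum_loop"], log["sum_formula"])
sim_other = jaccard(log["sum_loop"], log["square"])
print(f"sum_loop vs sum_formula: {sim_clone:.2f}")
print(f"sum_loop vs square:      {sim_other:.2f}")
```

In this sketch the two summation functions share every observed (input, output) pair, while sum_loop and square agree only on the inputs 0 and 1. A real system would also have to handle object-typed inputs and outputs, side effects, and a similarity threshold, none of which this toy example addresses.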
Pages: 10