Identifying Functionally Similar Code in Complex Codebases

Cited by: 0
Authors
Su, Fang-Hsiang [1]
Bell, Jonathan [1]
Kaiser, Gail [1]
Sethumadhavan, Simha [1]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
Keywords
I/O behavior; dynamic analysis; code clone detection; data flow analysis; patterns
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Identifying similar code in software systems can assist many software engineering tasks, such as program understanding and software refactoring. While most approaches focus on identifying code that looks alike, some techniques aim to detect code that functions alike. Detecting these functional clones in object-oriented languages remains an open question because of the difficulty of exposing and comparing programs' functionality effectively. We propose a novel technique, In-Vivo Clone Detection, that detects functional clones in arbitrary programs by identifying and mining their inputs and outputs. The key insight is to use existing workloads to execute programs and then measure functional similarity between programs based on their inputs and outputs, which mitigates the problems in object-oriented languages reported by prior work. We implement this technique in our system, HitoshiIO, which is open source and freely available. Our experimental results show that HitoshiIO detects more than 800 functional clones across a corpus of 118 projects. In a random sample of the detected clones, HitoshiIO achieves a true positive rate of 68+% with a false positive rate of only 15%.
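The idea of comparing programs by their observed inputs and outputs can be illustrated with a minimal sketch. This is not HitoshiIO's implementation: the abstract does not specify the similarity measure, so Jaccard similarity over recorded (input, output) pairs is an assumption made here, and the helper names `record_io` and `io_similarity`, the three toy functions, and the workload are all hypothetical.

```python
from typing import Any, Callable, FrozenSet, Iterable, Tuple

# One observation: the argument tuple a function received and the value it returned.
IORecord = Tuple[Tuple[Any, ...], Any]


def record_io(func: Callable[..., Any],
              workload: Iterable[Tuple[Any, ...]]) -> FrozenSet[IORecord]:
    """Execute func on each argument tuple of the workload and keep (inputs, output) pairs."""
    return frozenset((args, func(*args)) for args in workload)


def io_similarity(a: FrozenSet[IORecord], b: FrozenSet[IORecord]) -> float:
    """Jaccard similarity of two I/O profiles (an assumed measure, not the paper's)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two toy functions that look different but behave the same, plus one that differs.
def sum_loop(n: int) -> int:
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n: int) -> int:
    return n * (n + 1) // 2


def square(n: int) -> int:
    return n * n


# A stand-in for an "existing workload": the inputs the functions are exercised with.
workload = [(n,) for n in range(6)]
profiles = {f.__name__: record_io(f, workload)
            for f in (sum_loop, sum_formula, square)}

print(io_similarity(profiles["sum_loop"], profiles["sum_formula"]))
print(io_similarity(profiles["sum_loop"], profiles["square"]))
```

Under this sketch, `sum_loop` and `sum_formula` produce identical I/O profiles despite sharing no syntax, while `square` overlaps with them only on the inputs 0 and 1. A real system would also need to decide what counts as an input or output of a method (for example, fields read and written), which this toy version sidesteps by using only arguments and return values.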
Pages: 10