Scalable and Systematic Detection of Buggy Inconsistencies in Source Code

Cited by: 30
Authors
Gabel, Mark [1]
Yang, Junfeng [2]
Yu, Yuan
Goldszmidt, Moises
Su, Zhendong [1]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
[2] Columbia Univ, New York, NY 10027 USA
Keywords
Languages; Reliability; Algorithms; Experimentation
DOI
10.1145/1932682.1869475
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Software developers often duplicate source code to replicate functionality. This practice can hinder the maintenance of a software project: bugs may arise when two identical code segments are edited inconsistently. This paper presents DejaVu, a highly scalable system for detecting these general syntactic inconsistency bugs. DejaVu operates in two phases. Given a target code base, a parallel inconsistent clone analysis first enumerates all groups of source code fragments that are similar but not identical. Next, an extensible buggy change analysis framework refines these results, separating each group of inconsistent fragments into a fine-grained set of inconsistent changes and classifying each as benign or buggy. On a 75+ million line pre-production commercial code base, DejaVu executed in under five hours and produced a report of over 8,000 potential bugs. Our analysis of a sizable random sample suggests with high likelihood that this report contains at least 2,000 true bugs and 1,000 code smells. These bugs draw from a diverse class of software defects and are often simple to correct: syntactic inconsistencies both indicate problems and suggest solutions.
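To make the idea of "similar but not identical" fragments concrete, the sketch below pairs up token sequences and lists the tokens on which each near-duplicate pair disagrees. It is an illustration only and not DejaVu's algorithm: the function names, the regex tokenizer, the pairwise comparison, and the 0.8 similarity threshold are all assumptions made for this example, and it does not attempt the benign/buggy classification of the second phase.

```python
import difflib
import re
from itertools import combinations

# Hypothetical tokenizer: identifiers, integers, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(fragment):
    """Split a code fragment into a flat list of tokens."""
    return TOKEN.findall(fragment)


def inconsistent_pairs(fragments, threshold=0.8):
    """Yield (i, j, changes) for fragment pairs that are similar but not identical.

    `threshold` is an assumed similarity cutoff, not a value from the paper.
    `changes` lists the differing token runs as (tokens_in_i, tokens_in_j).
    """
    tokens = [tokenize(f) for f in fragments]
    for i, j in combinations(range(len(fragments)), 2):
        if tokens[i] == tokens[j]:
            continue  # exact clones carry no inconsistency
        matcher = difflib.SequenceMatcher(None, tokens[i], tokens[j], autojunk=False)
        if matcher.ratio() < threshold:
            continue  # too different to be treated as clones
        changes = [
            (tokens[i][a1:a2], tokens[j][b1:b2])
            for tag, a1, a2, b1, b2 in matcher.get_opcodes()
            if tag != "equal"
        ]
        yield i, j, changes


if __name__ == "__main__":
    fragments = [
        "if (p != NULL) free(p);",
        "if (q != NULL) free(p);",  # guard was renamed, the call was not
        "return 0;",
    ]
    for i, j, changes in inconsistent_pairs(fragments):
        print(i, j, changes)
```

Comparing every pair is quadratic in the number of fragments, so this form only suits small inputs; it is meant to show what an inconsistent change looks like, not how to find one at the scale the abstract reports.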
Pages: 175-190
Number of pages: 16