Evaluating defect prediction approaches: a benchmark and an extensive comparison

Cited by: 0
Authors
Marco D’Ambros
Michele Lanza
Romain Robbes
Affiliations
[1] University of Lugano, REVEAL @ Faculty of Informatics
[2] University of Chile, PLEIAD Lab @ Computer Science Department (DCC)
Source
Empirical Software Engineering, 2012, 17(4-5)
Keywords
Defect prediction; Source code metrics; Change metrics
DOI
Not available
Abstract
Reliably predicting software defects is one of the holy grails of software engineering. Researchers have devised and implemented a plethora of defect/bug prediction approaches varying in terms of accuracy, complexity and the input data they require. However, the absence of an established benchmark makes it hard, if not impossible, to compare approaches. We present a benchmark for defect prediction, in the form of a publicly available dataset consisting of several software systems, and provide an extensive comparison of well-known bug prediction approaches, together with novel approaches we devised. We evaluate the performance of the approaches using different performance indicators: the classification of entities as defect-prone or not, and the ranking of entities, with and without taking into account the effort needed to review an entity. We performed three sets of experiments aimed at (1) comparing the approaches across different systems, (2) testing whether the differences in performance are statistically significant, and (3) investigating the stability of approaches across different learners. Our results indicate that, while some approaches perform better than others in a statistically significant manner, external validity in defect prediction is still an open problem, as generalizing results to different contexts/learners proved to be a partially unsuccessful endeavor.
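As a concrete illustration of the performance indicators described in the abstract, the following is a minimal sketch, assuming scikit-learn, SciPy, and synthetic data: it classifies entities as defect-prone or not, ranks them by predicted risk, and repeats the ranking with a simple effort proxy. The two metrics (lines of code and past changes) and the LOC-based effort proxy are hypothetical stand-ins, not the benchmark's actual features or the authors' pipeline.

```python
# Minimal sketch of the three kinds of performance indicators, assuming
# scikit-learn and SciPy. All data, metrics, and the effort proxy are
# synthetic placeholders, not the benchmark's actual features.
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "entities" (e.g., classes) with two illustrative metrics:
# a source code metric (lines of code) and a change metric (past changes).
n = 500
loc = rng.lognormal(mean=5.0, sigma=1.0, size=n)       # lines of code
churn = rng.poisson(lam=loc / 100.0)                   # number of past changes
defects = rng.poisson(lam=0.002 * loc + 0.3 * churn)   # post-release defects
X = np.column_stack([loc, churn])
y = (defects > 0).astype(int)                          # defect-prone or not

X_tr, X_te, y_tr, y_te, d_tr, d_te = train_test_split(
    X, y, defects, test_size=0.3, random_state=0)

# (1) Classification indicator: score entities as defect-prone or not (AUC).
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
risk = clf.predict_proba(X_te)[:, 1]
print("classification AUC:", roc_auc_score(y_te, risk))

# (2) Ranking indicator: agreement between the predicted ranking of
# entities and their actual defect counts.
rho, _ = spearmanr(risk, d_te)
print("ranking (Spearman rho):", rho)

# (3) Effort-aware variant: rank by predicted risk per unit of review
# effort, using LOC as a crude effort proxy (an assumption of this sketch).
rho_e, _ = spearmanr(risk / X_te[:, 0], d_te)
print("effort-aware ranking (Spearman rho):", rho_e)
```

The actual study evaluates many approaches and several learners per system and tests the differences for statistical significance; this sketch only mirrors the overall structure of such an evaluation on one synthetic system.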
Pages: 531-577
Page count: 46
Related papers
10 of 50 shown
  • [1] Evaluating defect prediction approaches: a benchmark and an extensive comparison
    D'Ambros, Marco
    Lanza, Michele
    Robbes, Romain
    EMPIRICAL SOFTWARE ENGINEERING, 2012, 17 (4-5) : 531 - 577
  • [2] A Comparative Study to Benchmark Cross-project Defect Prediction Approaches
    Herbold, Steffen
    Trautsch, Alexander
    Grabowski, Jens
    PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2018, : 1063 - 1063
  • [3] A Comparative Study to Benchmark Cross-Project Defect Prediction Approaches
    Herbold, Steffen
    Trautsch, Alexander
    Grabowski, Jens
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2018, 44 (09) : 811 - 833
  • [4] Evaluating benchmark subsetting approaches
    Yi, Joshua J.
    Sendag, Resit
    Eeckhout, Lieven
    Joshi, Ajay
    Lilja, David J.
    John, Lizy K.
    PROCEEDINGS OF THE IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION, 2006, : 93+
  • [5] Benchmark for Evaluating Pedestrian Action Prediction
    Kotseruba, Iuliia
    Rasouli, Amir
    Tsotsos, John K.
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1257 - 1267
  • [6] Evaluating Defect Prediction Approaches Using A Massive Set of Metrics: An Empirical Study
    Xuan, Xiao
    Lo, David
    Xia, Xin
    Tian, Yuan
    30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II, 2015, : 1644 - 1647
  • [7] A comprehensive computational benchmark for evaluating deep learning-based protein function prediction approaches
    Wang, Wenkang
    Shuai, Yunyan
    Yang, Qiurong
    Zhang, Fuhao
    Zeng, Min
    Li, Min
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (02)
  • [8] A Comparison of Semi-Supervised Classification Approaches for Software Defect Prediction
    Catal, Cagatay
    JOURNAL OF INTELLIGENT SYSTEMS, 2014, 23 (01) : 75 - 82
  • [9] Comparison of Selected Portfolio Approaches with Benchmark
    Nedela, David
    38TH INTERNATIONAL CONFERENCE ON MATHEMATICAL METHODS IN ECONOMICS (MME 2020), 2020, : 389 - 395
  • [10] Building a Benchmark for Evaluating Link Prediction Methods
    Xiao, Junyan
    Wang, Peng
    Meng, Yue
    2018 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2018, : 1065 - 1070