Discretization Methods for NBC in Effort Estimation: An Empirical Comparison based on ISBSG Projects

Cited by: 0

Authors:

Fernandez-Diego, Marta [1]

Torralba-Martinez, Jose-Maria [1]

Affiliations:

[1] Univ Politecn Valencia, Dept Business Adm, Valencia 46022, Spain

Source:

PROCEEDINGS OF THE ACM-IEEE INTERNATIONAL SYMPOSIUM ON EMPIRICAL SOFTWARE ENGINEERING AND MEASUREMENT (ESEM'12) | 2012

Keywords:

Effort estimation; software projects; Bayesian networks; Naive Bayes Classifier; discretization methods; ISBSG; MODEL; PREDICTION;

DOI:

Not available

Chinese Library Classification number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Background: Bayesian networks have been applied in many fields, including effort estimation in software engineering. Even though there are Bayesian inference algorithms that can handle continuous variables, performance tends to be better when these variables are discretized than when they are assumed to follow a specific distribution. On the other hand, the choice of the discretization method and the number of discretized intervals may lead to significantly different estimation results. However, discretization issues are seldom mentioned in software engineering effort estimation models. Aim: This paper seeks to show that discretization issues are important in terms of prediction accuracy when building a Naive Bayes Classifier (NBC) for estimating software effort. Method: For this purpose, an NBC model has been developed for software effort estimation based on ISBSG projects, applying different discretization schemes (equal width intervals, equal frequency intervals, and k-means clustering) and using different numbers of intervals. Results: Regarding the NBC model built, the estimation accuracy of equal frequency discretization is improved by k-means clustering only with respect to Pred(0.25), although it better reflects the original distribution. Conclusions: Further experimentation should determine the potential of clustering methods already highlighted in other fields.
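The three discretization schemes and the Pred(0.25) measure named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the simple one-dimensional k-means with spread-out initial centers, the handling of values that fall exactly on a cut point, and the effort values are all assumptions made for illustration, and no ISBSG data is used.

```python
from bisect import bisect_right

def equal_width_edges(values, k):
    """Interior cut points that split [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def equal_frequency_edges(values, k):
    """Interior cut points chosen so each interval holds roughly n/k values."""
    s = sorted(values)
    n = len(s)
    return [s[i * n // k] for i in range(1, k)]

def kmeans_edges(values, k, max_iter=50):
    """Cut points at midpoints between the centers of a simple 1-D k-means."""
    s = sorted(values)
    n = len(s)
    # Assumed initialization: centers spread across the sorted data.
    centers = [float(s[(2 * i + 1) * n // (2 * k)]) for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in s:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    centers.sort()
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]

def assign_bin(value, edges):
    """Index of the interval containing value; a value on a cut point goes up."""
    return bisect_right(edges, value)

def bin_counts(values, edges):
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[assign_bin(v, edges)] += 1
    return counts

def pred(actual, predicted, level=0.25):
    """Fraction of estimates whose magnitude of relative error is <= level."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= level)
    return hits / len(actual)

# Hypothetical, right-skewed effort values (not ISBSG data).
efforts = [100, 120, 150, 200, 400, 800, 1600, 3200, 6400]
for name, fn in [("equal width", equal_width_edges),
                 ("equal frequency", equal_frequency_edges),
                 ("k-means", kmeans_edges)]:
    edges = fn(efforts, 3)
    print(name, edges, bin_counts(efforts, edges))
```

On skewed data such as this, equal width intervals tend to place most projects in the first interval, while equal frequency intervals balance the counts by construction. In an NBC, the interval index would then serve as the class label or as a discretized attribute value.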


Pages: 103-106

Page count: 4

