Discretization Methods for NBC in Effort Estimation: An Empirical Comparison based on ISBSG Projects

Authors
Fernandez-Diego, Marta [1]
Torralba-Martinez, Jose-Maria [1]
Affiliations
[1] Univ Politecn Valencia, Dept Business Adm, Valencia 46022, Spain
Keywords
Effort estimation; software projects; Bayesian networks; Naive Bayes Classifier; discretization methods; ISBSG; MODEL; PREDICTION;
DOI
Not available
Chinese Library Classification
TP31 [Computer Software];
Discipline classification codes
081202; 0835
Abstract
Background: Bayesian networks have been applied in many fields, including effort estimation in software engineering. Although some Bayesian inference algorithms can handle continuous variables, performance tends to be better when these variables are discretized than when they are assumed to follow a specific distribution. However, the choice of discretization method and the number of intervals may lead to significantly different estimates, and discretization issues are seldom mentioned in software engineering effort estimation models. Aim: This paper seeks to show that discretization choices matter for prediction accuracy when building a Naive Bayes Classifier (NBC) for estimating software effort. Method: An NBC model was developed for software effort estimation based on ISBSG projects, applying different discretization schemes (equal width intervals, equal frequency intervals, and k-means clustering) and different numbers of intervals. Results: For the NBC model built, k-means clustering improves on the estimation accuracy of equal frequency discretization only with respect to Pred(0.25), although it better reflects the original distribution. Conclusions: Further experimentation should determine the potential of clustering methods, which has already been highlighted in other fields.
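The three discretization schemes named in the abstract, together with the Pred(0.25) accuracy measure, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the quantile-based k-means initialization, and the choice to express each scheme as a list of inner cut points are all assumptions made for illustration, and Pred(l) is taken in its usual sense as the fraction of projects whose magnitude of relative error is at most l.

```python
from bisect import bisect_right


def equal_width_edges(values, k):
    """Return k-1 inner cut points splitting [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_edges(values, k):
    """Return k-1 inner cut points so that each bin holds roughly the same count."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def kmeans_edges(values, k, iterations=100):
    """1-D Lloyd's k-means; cut points are midpoints between sorted centroids."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic start with centroids at evenly spaced quantiles.
    centroids = [ordered[((2 * i + 1) * n) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in ordered:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    centroids.sort()
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]


def discretize(values, edges):
    """Map each value to a bin index in 0..len(edges)."""
    return [bisect_right(edges, v) for v in values]


def pred(actual, estimated, level=0.25):
    """Fraction of projects with |actual - estimated| / actual <= level."""
    hits = sum(abs(a - e) / a <= level for a, e in zip(actual, estimated))
    return hits / len(actual)


# Hypothetical, right-skewed effort values (not ISBSG data).
efforts = [100, 120, 150, 200, 300, 500, 900, 4000]
for name, make_edges in [("equal width", equal_width_edges),
                         ("equal frequency", equal_frequency_edges),
                         ("k-means", kmeans_edges)]:
    print(name, discretize(efforts, make_edges(efforts, 2)))
```

On skewed data such as this, equal width binning places nearly all projects in the first interval, while equal frequency binning balances the counts regardless of how the values are spread. Differences of this kind are why the choice of scheme and the number of intervals can change the class labels an NBC is trained on.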
Pages: 103-106
Number of pages: 4