Review and Empirical Analysis of Machine Learning-Based Software Effort Estimation

Cited by: 1
Authors
Rahman, Mizanur [1]
Sarwar, Hasan [2]
Kader, MD. Abdul [3]
Goncalves, Teresa [4]
Tin, Ting Tin [5]
Affiliations
[1] Western Illinois Univ, Sch Comp Sci, Macomb, IL 61455 USA
[2] United Int Univ, Dept Comp Sci & Engn, Dhaka 1212, Bangladesh
[3] Univ Malaysia Pahang Al Sultan Abdullah, Fac Comp, Pekan 26600, Malaysia
[4] Univ Evora, Dept Informat, P-7004516 Evora, Portugal
[5] INTI Int Univ, Fac Data Sci & Informat Technol, Nilai 71800, Malaysia
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Estimation; Machine learning algorithms; Software reliability; Software algorithms; Research and development; Software development management; Linear regression; Support vector machines; Random forests; Software effort estimation; software development efforts estimation; linear regression; support vector machine; random forest; LASSO; KNN; R&D investment;
DOI
10.1109/ACCESS.2024.3404879
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Software companies typically spend a substantial share of their revenue on Research and Development (R&D) aimed at delivering software on time. Accurate software effort estimation (SEE) is critical for successful project planning, resource allocation, and on-time delivery within budget for sustainable software development. However, both overestimation and underestimation pose significant challenges, highlighting the need for continuous improvement in estimation techniques. This study reviews recent machine learning approaches employed to enhance the accuracy of SEE, focusing on research published between 2020 and 2023. The literature review followed a systematic approach to identify relevant research on machine learning techniques for SEE. Additionally, comparative experiments were conducted using five commonly employed Machine Learning (ML) methods: K-Nearest Neighbor, Support Vector Machine, Random Forest, Logistic Regression, and LASSO Regression. The performance of these techniques was evaluated using five widely adopted accuracy metrics: Mean Squared Error (MSE), Mean Magnitude of Relative Error (MMRE), R-squared, Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). The evaluation was carried out on seven benchmark datasets that are publicly available and extensively used in SEE research: Albrecht, Desharnais, China, Kemerer, Miyazaki94, Maxwell, and COCOMO. By carefully reviewing study quality, analyzing results across the literature, and rigorously evaluating experimental outcomes, the study draws clear conclusions about the most promising techniques for achieving state-of-the-art accuracy in estimating software effort.
This study makes three key contributions to the field: firstly, it furnishes a thorough overview of recent machine learning research in software effort estimation (SEE); secondly, it provides data-driven guidance for researchers and practitioners to select optimal methods for accurate effort estimation; and thirdly, it demonstrates, through experimental analysis, how these methods perform on publicly available datasets. Enhanced estimation supports the development of better predictive models for software project time, cost, and staffing needs. The findings aim to guide future research directions and tool development toward the most accurate machine learning approaches for modelling software development effort, costs, and delivery schedules, ultimately contributing to more efficient and cost-effective software projects.
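The five accuracy metrics named in the abstract can be illustrated with a minimal sketch. This record does not give the paper's exact formulas, so the definitions below follow the commonly used conventions and are assumptions: MMRE is taken as the mean of |actual - predicted| / actual, and MAPE as the same quantity expressed as a percentage. The function name `see_metrics` and the sample effort values are hypothetical and are not taken from the paper or its datasets.

```python
import math


def see_metrics(actual, predicted):
    """Compute MSE, RMSE, MMRE, MAPE, and R-squared for paired effort values.

    Assumes conventional definitions (not confirmed by the source record)
    and that every actual effort value is strictly positive.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a <= 0 for a in actual):
        raise ValueError("actual effort values must be positive")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Squared-error metrics.
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Relative-error metrics: magnitude of relative error per project.
    mre = [abs(e) / a for e, a in zip(errors, actual)]
    mmre = sum(mre) / n
    mape = 100.0 * mmre

    # R-squared: 1 - residual sum of squares / total sum of squares.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MSE": mse, "RMSE": rmse, "MMRE": mmre, "MAPE": mape, "R2": r2}


# Hypothetical effort values (e.g. person-hours), for illustration only.
actual_effort = [100.0, 200.0, 400.0]
predicted_effort = [110.0, 180.0, 400.0]
metrics = see_metrics(actual_effort, predicted_effort)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions MMRE and MAPE differ only by a factor of 100, and RMSE is the square root of MSE, so the five metrics carry three independent pieces of information about a model's errors.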
Pages: 85661 - 85680
Page count: 20