Improved prediction of software defects using ensemble machine learning techniques

Cited by: 28
Authors
Mehta, Sweta [1]
Patnaik, K. Sridhar [1]
Affiliations
[1] Birla Inst Technol, Dept Comp Sci & Engn, Ranchi 835315, Bihar, India
Source
NEURAL COMPUTING & APPLICATIONS | 2021 / Vol. 33 / Issue 16
Keywords
Defect prediction; Dimension reduction; Data imbalance; Machine learning algorithms; XGBoost; Stacking ensemble classifier;
DOI
10.1007/s00521-021-05811-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Software testing is a crucial part of software development. Errors made by developers are generally fixed at a later stage of the development process, which increases the impact of each defect. To prevent this, defects need to be predicted early in development, which in turn helps use testing resources efficiently. Defect prediction involves classifying software modules as defect prone or non-defect prone. This paper aims to reduce the impact of two major issues faced during defect prediction: data imbalance and the high dimensionality of defect datasets. Various software metrics are evaluated using feature selection techniques such as Recursive Feature Elimination (RFE), correlation-based feature selection, Lasso, Ridge, ElasticNet and Boruta. Machine learning algorithms including Logistic Regression, Decision Trees, K-nearest neighbor, Support Vector Machines and Ensemble Learning are used in combination with the feature extraction and feature selection techniques to classify software modules as defect prone or non-defect prone. The proposed model uses a combination of Partial Least Squares (PLS) Regression and RFE for dimension reduction, further combined with the Synthetic Minority Oversampling Technique (SMOTE) because the datasets used are imbalanced. XGBoost and the Stacking Ensemble technique gave the best results on all datasets, with defect prediction accuracy above 0.9, compared with the other algorithms used in this work.
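The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is an interpolation between an existing minority sample and one of its nearest minority neighbours. This is not the authors' implementation; the function name `smote`, its parameters, and the toy metric values are assumptions made for illustration only, and a real study would more likely rely on an established library.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a randomly
    chosen minority row and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority rows, nearest to base first
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical defect-prone modules described by two metrics
# (lines of code, cyclomatic complexity); values are made up.
defective = [[120.0, 14.0], [135.0, 17.0], [110.0, 12.0], [150.0, 20.0]]
new_rows = smote(defective, n_new=6, k=2)
```

Because every synthetic row lies on a segment between two real minority rows, the new samples stay inside the region already occupied by the defect-prone class rather than duplicating existing rows exactly.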
Pages: 10551 - 10562
Page count: 12