Improved prediction of software defects using ensemble machine learning techniques

Cited by: 28
Authors
Mehta, Sweta [1]
Patnaik, K. Sridhar [1]
Affiliations
[1] Birla Inst Technol, Dept Comp Sci & Engn, Ranchi 835315, Bihar, India
Source
NEURAL COMPUTING & APPLICATIONS | 2021 / Vol. 33 / Issue 16
Keywords
Defect prediction; Dimension reduction; Data imbalance; Machine learning algorithms; XGBoost; Stacking ensemble classifier;
DOI
10.1007/s00521-021-05811-3
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Software testing is a crucial part of software development. Errors made by developers are generally fixed at a later stage of the development process, which increases the impact of the defect. To prevent this, defects need to be predicted early in development, which in turn helps in using testing resources efficiently. Defect prediction involves classifying software modules as defect prone or non-defect prone. This paper aims to reduce the impact of two major issues faced during defect prediction: data imbalance and the high dimensionality of defect datasets. In this research work, various software metrics are evaluated using feature selection techniques such as Recursive Feature Elimination (RFE), correlation-based feature selection, Lasso, Ridge, ElasticNet and Boruta. Logistic Regression, Decision Trees, K-nearest neighbor, Support Vector Machines and Ensemble Learning are among the machine learning algorithms used in combination with the feature extraction and feature selection techniques to classify software modules as defect prone or non-defect prone. The proposed model uses a combination of Partial Least Squares (PLS) Regression and RFE for dimension reduction, which is further combined with the Synthetic Minority Oversampling Technique (SMOTE) because the datasets used are imbalanced. XGBoost and the Stacking Ensemble technique gave the best results on all the datasets, with defect prediction accuracy above 0.9, compared with the other algorithms used in the research work.
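The oversampling step named in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation, written in plain Python. This is not the authors' implementation: the function name `smote`, its parameters, and the toy metric values are assumptions made for illustration only. Each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority points.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up module metrics (two features per module); not data from the paper.
defective = [[10.0, 3.0], [12.0, 4.0], [11.0, 5.0], [14.0, 4.5]]
clean_count = 10  # assumed size of the majority (non-defect-prone) class

new_rows = smote(defective, n_new=clean_count - len(defective), k=2)
balanced_defective = defective + new_rows
print(len(balanced_defective))
```

Because every synthetic row is a convex combination of two real minority rows, it stays within the range of the observed minority data. In practice, oversampling is usually applied to the training split only, so that evaluation data contains no synthetic samples.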
Pages: 10551-10562
Page count: 12
Related papers
50 in total
  • [11] DIABETES TWITTER ANALYSIS USING IMPROVED ENSEMBLE MACHINE LEARNING TECHNIQUES
    Prabha, V. Diviya
    Rathipriya, R.
    ADVANCES AND APPLICATIONS IN MATHEMATICAL SCIENCES, 2021, 21 (01): : 241 - 250
  • [12] Enhancing software defect prediction: a framework with improved feature selection and ensemble machine learning
    Ali, Misbah
    Mazhar, Tehseen
    Al-Rasheed, Amal
    Shahzad, Tariq
    Ghadi, Yazeed Yasin
    Khan, Muhammad Amir
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [13] Enhanced slope stability prediction using ensemble machine learning techniques
    Yadav, Devendra Kumar
    Chattopadhyay, Swarup
    Tripathy, Debi Prasad
    Mishra, Pragyan
    Singh, Pritiranjan
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [14] Performance prediction of impact hammer using ensemble machine learning techniques
    Ocak, Ibrahim
    Seker, Sadi Evren
    Rostami, Jamal
    TUNNELLING AND UNDERGROUND SPACE TECHNOLOGY, 2018, 80 : 269 - 276
  • [16] Game State Prediction with Ensemble of Machine Learning Techniques
    Woh, Sange-Myeong
    Lee, Jee-Hyong
    2018 JOINT 10TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 19TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS), 2018, : 89 - 92
  • [17] An empirical study of software reliability prediction using machine learning techniques
    Kumar, Pradeep
    Singh, Yogesh
    International Journal of System Assurance Engineering and Management, 2012, 3 (03) : 194 - 208
  • [18] Towards Effective Software Defect Prediction Using Machine Learning Techniques
    Akshat Pandey
    Akshay Jadhav
    SN Computer Science, 5 (8)
  • [19] Field scale wheat yield prediction using ensemble machine learning techniques
    Gawdiya, Sandeep
    Kumar, Dinesh
    Ahmed, Bulbul
    Sharma, Ramandeep Kumar
    Das, Pankaj
    Choudhary, Manoj
    Mattar, Mohamed A.
    SMART AGRICULTURAL TECHNOLOGY, 2024, 9
  • [20] Reliable prediction of software defects using Shapley interpretable machine learning models
    Al-Smadi, Yazan
    Eshtay, Mohammed
    Al-Qerem, Ahmad
    Nashwan, Shadi
    Ouda, Osama
    Abd El-Aziz, A. A.
    EGYPTIAN INFORMATICS JOURNAL, 2023, 24 (03)