Using Class Imbalance Learning for Software Defect Prediction

Cited by: 389
Authors
Wang, Shuo [1]
Yao, Xin [1]
Affiliations
[1] University of Birmingham, School of Computer Science, CERCIA, Birmingham B15 2TT, West Midlands, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Class imbalance learning; ensemble learning; negative correlation learning; software defect prediction; static code attributes; neural-networks; machine
DOI
10.1109/TR.2013.2259203
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
To facilitate software testing and reduce testing costs, a wide range of machine learning methods have been studied for predicting defects in software modules. Unfortunately, the imbalanced nature of this type of data makes the learning task more difficult. Class imbalance learning specializes in classification problems with imbalanced distributions, which could be helpful for defect prediction, but it has not been investigated in depth so far. In this paper, we study whether and how class imbalance learning methods can benefit software defect prediction, with the aim of finding better solutions. We investigate different types of class imbalance learning methods, including resampling techniques, threshold moving, and ensemble algorithms. Among the methods we studied, AdaBoost.NC shows the best overall performance in terms of balance, G-mean, and Area Under the Curve (AUC). To further improve the performance of the algorithm and facilitate its use in software defect prediction, we propose a dynamic version of AdaBoost.NC, which adjusts its parameter automatically during training. Without the need to pre-define any parameters, it is shown to be more effective and efficient than the original AdaBoost.NC.
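The abstract names balance and G-mean as evaluation measures and threshold moving as one of the techniques studied, but this record does not give their formulas. The sketch below uses the definitions commonly found in the defect-prediction literature, which is an assumption here rather than something taken from the paper: PD is the detection rate on defective modules, PF is the false-alarm rate, balance is one minus the normalized distance from the ideal point (PF = 0, PD = 1), and G-mean is the geometric mean of PD and 1 - PF. The function names `defect_metrics` and `move_threshold` are hypothetical.

```python
import math


def defect_metrics(tp, fn, fp, tn):
    """Return (pd, pf, balance, g_mean) from confusion-matrix counts.

    Defective modules are treated as the positive (minority) class.
    Assumed definitions, not quoted from the paper.
    """
    pd = tp / (tp + fn)  # probability of detection (recall on defective modules)
    pf = fp / (fp + tn)  # probability of false alarm on defect-free modules
    # Distance from the ideal point (pf=0, pd=1), scaled into [0, 1].
    balance = 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)
    # Geometric mean of the per-class accuracies.
    g_mean = math.sqrt(pd * (1 - pf))
    return pd, pf, balance, g_mean


def move_threshold(scores, threshold):
    """Threshold moving: label a module defective (1) when its predicted
    defect score reaches `threshold`; lowering it favors the minority class."""
    return [1 if s >= threshold else 0 for s in scores]


if __name__ == "__main__":
    # Illustrative counts only, not results from the paper.
    print(defect_metrics(tp=8, fn=2, fp=10, tn=80))
    print(move_threshold([0.1, 0.35, 0.6], threshold=0.3))
```

Unlike plain accuracy, both measures drop to zero when a classifier labels every module defect-free, which is why they are preferred on imbalanced defect data.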
Pages: 434-443
Number of pages: 10
Related papers
50 in total
  • [21] Hellinger Net: A Hybrid Imbalance Learning Model to Improve Software Defect Prediction
    Chakraborty, Tanujit
    Chakraborty, Ashis Kumar
    IEEE TRANSACTIONS ON RELIABILITY, 2021, 70 (02) : 481 - 494
  • [22] Support Vector based Oversampling Technique for Handling Class Imbalance in Software Defect Prediction
    Malhotra, Ruchika
    Agrawal, Vaibhav
    Pal, Vedansh
    Agarwal, Tushar
    2021 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2021), 2021, : 1078 - 1083
  • [23] An empirical study toward dealing with noise and class imbalance issues in software defect prediction
    Pandey, Sushant Kumar
    Tripathi, Anil Kumar
    SOFT COMPUTING, 2021, 25 (21) : 13465 - 13492
  • [24] Class Imbalance Evolution and Verification Latency in Just-in-Time Software Defect Prediction
    Cabral, George G.
    Minku, Leandro L.
    Shihab, Emad
    Mujahid, Suhaib
    2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2019), 2019, : 666 - 676
  • [26] Software defect prediction using a cost sensitive decision forest and voting, and a potential solution to the class imbalance problem
    Siers, Michael J.
    Islam, Md Zahidul
    INFORMATION SYSTEMS, 2015, 51 : 62 - 71
  • [27] Cross-project defect prediction using data sampling for class imbalance learning: an empirical study
    Goel, Lipika
    Sharma, Mayank
    Khatri, Sunil Kumar
    Damodaran, D.
    INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2019, : 130 - 143
  • [28] Boosting Software Fault Prediction: Addressing Class Imbalance With Enhanced Ensemble Learning
    Alsorory, Hanan Sharif
    Alshraideh, Mohammad
    APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING, 2024, 2024
  • [29] Software defect prediction using learning to rank approach
    Nassif, Ali Bou
    Talib, Manar Abu
    Azzeh, Mohammad
    Alzaabi, Shaikha
    Khanfar, Rawan
    Kharsa, Ruba
    Angelis, Lefteris
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [30] Performing Software Defect Prediction Using Deep Learning
    Gurung, Saksham
    Communications in Computer and Information Science, 2022, 1697 CCIS : 319 - 331