Neighbor cleaning learning based cost-sensitive ensemble learning approach for software defect prediction

Cited by: 2
Authors
Li, Li [1]
Su, Renjia [1]
Zhao, Xin [1]
Affiliations
[1] Northeast Forestry Univ, Sch Comp & Control Engn, Harbin, Peoples R China
Source
Keywords
class imbalance; class overlap; cost-sensitive learning; machine learning; software defect prediction;
DOI
10.1002/cpe.8017
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Class imbalance in software defect prediction datasets biases prediction results toward the majority class, and class overlap blurs the classification decision boundary; both degrade a model's prediction performance. Neighbor cleaning learning (NCL) is an effective technique for defect prediction. To address both the class overlap and class imbalance problems, an NCL-based cost-sensitive ensemble learning approach for software defect prediction (NCL_CSEL) is proposed. First, the base classifier is trained on bootstrap-resampled data. Subsequently, multiple classifiers are combined by a static ensemble to obtain the final classification results. The base classifier is an Adaptive Boosting (AdaBoost) classifier that combines NCL and cost-sensitive learning: the NCL algorithm initializes the sample weights, and the cost-sensitive method updates them. The two problems are addressed jointly by balancing the proportion of overlapping samples removed by NCL against the size of the cost factor in cost-sensitive learning. Experiments on the NASA and AEEEM datasets show that adding the NCL algorithm improves the defect prediction model's bal value by approximately 7% and its AUC value by 9.5%. NCL_CSEL effectively addresses the class imbalance problem and significantly improves prediction performance compared with existing methods for handling class imbalance.
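The weight handling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact rules, so the code assumes the classic two-class NCL cleaning rule with k = 3 neighbors, assumes that flagged overlap samples simply receive an initial weight of zero, and uses one common cost-weighted form of the AdaBoost update. The names `ncl_initial_weights`, `cost_sensitive_update`, and `cost_factor` are hypothetical.

```python
import math

MINORITY, MAJORITY = 1, -1  # +1 = defective (minority), -1 = clean (majority)


def nearest_neighbors(X, i, k=3):
    """Indices of the k samples closest to X[i] (Euclidean), excluding i itself."""
    order = sorted((math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i)
    return [j for _, j in order[:k]]


def ncl_initial_weights(X, y, k=3):
    """Assumed NCL-style initialization: majority samples lying in overlap
    regions get weight 0; all other samples share the weight uniformly."""
    flagged = set()
    for i in range(len(X)):
        nbrs = nearest_neighbors(X, i, k)
        minority_votes = sum(1 for j in nbrs if y[j] == MINORITY)
        predicted = MINORITY if 2 * minority_votes > k else MAJORITY
        if y[i] == MAJORITY and predicted == MINORITY:
            # Majority sample misclassified by its neighbors: treat as overlap.
            flagged.add(i)
        elif y[i] == MINORITY and predicted == MAJORITY:
            # Minority sample misclassified: flag its majority-class neighbors.
            flagged.update(j for j in nbrs if y[j] == MAJORITY)
    kept = len(X) - len(flagged)
    return [0.0 if i in flagged else 1.0 / kept for i in range(len(X))]


def cost_sensitive_update(weights, y, preds, cost_factor=2.0, eps=1e-10):
    """One boosting round with an assumed cost-weighted update:
    w_i <- c_i * w_i * exp(-alpha * y_i * h_i), then normalize.
    Minority samples use c_i = cost_factor, majority samples c_i = 1."""
    total = sum(weights)
    err = sum(w for w, t, p in zip(weights, y, preds) if t != p) / total
    err = min(max(err, eps), 1.0 - eps)  # keep the logarithm finite
    alpha = 0.5 * math.log((1.0 - err) / err)
    raw = [
        (cost_factor if t == MINORITY else 1.0) * w * math.exp(-alpha * t * p)
        for w, t, p in zip(weights, y, preds)
    ]
    z = sum(raw)
    return [r / z for r in raw], alpha
```

In this sketch, removing more overlap samples and raising `cost_factor` both shift weight toward the defective class, which is the trade-off the abstract describes balancing. A full model along the lines of the abstract would repeat the update over several boosting rounds and train one such booster per bootstrap sample before combining them.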
Pages: 14
Related papers
50 in total
  • [21] Software defect prediction using cost-sensitive neural network
    Arar, Omer Faruk
    Ayan, Kursat
    APPLIED SOFT COMPUTING, 2015, 33 : 263 - 277
  • [22] Cost-Sensitive Feature Selection with Application in Software Defect Prediction
    Miao, Linsong
    Liu, Mingxia
    Zhang, Daoqiang
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 967 - 970
  • [23] Cost-Sensitive and Sparse Ladder Network for Software Defect Prediction
    Sun, Jing
    Ji, Yi-mu
    Liu, Shangdong
    Wu, Fei
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2020, E103D (05): : 1177 - 1180
  • [24] Cost-sensitive and ensemble-based prediction model for outsourced software project risk prediction
    Hu, Yong
    Feng, Bin
    Mo, Xizhu
    Zhang, Xiangzhou
    Ngai, E. W. T.
    Fan, Ming
    Liu, Mei
    DECISION SUPPORT SYSTEMS, 2015, 72 : 11 - 23
  • [25] An Ensemble Learning Approach for Software Defect Prediction in Developing Quality Software Product
    Saheed, Yakub Kayode
    Longe, Olumide
    Baba, Usman Ahmad
    Rakshit, Sandip
    Vajjhala, Narasimha Rao
    ADVANCES IN COMPUTING AND DATA SCIENCES, PT I, 2021, 1440 : 317 - 326
  • [26] Cost-Sensitive Learning
    Zhou, Zhi-Hua
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2011, 2011, 6820 : 17 - 18
  • [27] Software Defect Prediction Method Based on Clustering Ensemble Learning
    Tao, Hongwei
    Cao, Qiaoling
    Chen, Haoran
    Li, Yanting
    Niu, Xiaoxu
    Wang, Tao
    Geng, Zhenhao
    Shang, Songtao
    IET SOFTWARE, 2024, 2024
  • [28] Cost-Sensitive Metaheuristic Optimization-Based Neural Network with Ensemble Learning for Financial Distress Prediction
    Safi, Salah Al-Deen
    Castillo, Pedro A.
    Faris, Hossam
    APPLIED SCIENCES-BASEL, 2022, 12 (14):
  • [29] A Hierarchical Feature Ensemble Deep Learning Approach for Software Defect Prediction
    Zhang, Shenggang
    Jiang, Shujuan
    Yan, Yue
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2023, 33 (04) : 543 - 573
  • [30] Ensemble of Cost-Sensitive Hypernetworks for Class-Imbalance Learning
    Wang, Jin
    Huang, Ping-li
    Sun, Kai-wei
    Cao, Bao-lin
    Zhao, Rui
    2013 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2013), 2013, : 1883 - 1888