An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction

Cited by: 0
Authors
Han, Hui [1]
Yu, Qiao [1]
Zhu, Yi [1]
Cheng, Shengyi [1]
Zhang, Yu [1]
Affiliation
[1] Jiangsu Normal Univ, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Software defect prediction; cross-version defect prediction; class overlap
DOI
10.1142/S0218194024500414
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Class overlap occurs when instances from different classes overlap heavily in the feature space. It is one of the obstacles to improving the performance of software defect prediction (SDP). Existing studies on the impact of class overlap on SDP have focused mainly on within-project and cross-project defect prediction, and the existing methods for cleaning class-overlap instances are not suitable for cross-version defect prediction. In this paper, we propose a method for cleaning class-overlap instances based on the Ratio of K-nearest neighbors with the Same Label (RKSL). The method removes instances in the training set whose neighbor ratio is abnormal. Using RKSL, we investigate the impact of class overlap on the performance and interpretability of cross-version defect prediction models. The experimental results show that class overlap can significantly affect the performance of cross-version defect prediction models. The RKSL method can handle the class overlap problem in defect datasets, but it may affect the interpretability of the models. From an analysis of feature changes, we conclude that cleaning class-overlap instances can help models identify more important features.
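The abstract describes RKSL only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes Euclidean distance, and it assumes that an "abnormal" neighbor ratio means a same-label ratio below a cutoff. The names `same_label_ratio` and `clean_overlap`, and the parameters `k` and `threshold`, are hypothetical and do not come from the paper.

```python
import math


def same_label_ratio(X, y, k=5):
    """For each instance, return the fraction of its k nearest
    neighbors (Euclidean distance, assumed) that share its label."""
    ratios = []
    for i, xi in enumerate(X):
        # Distances to every other instance; the instance itself is excluded.
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        same = sum(1 for j in neighbors if y[j] == y[i])
        ratios.append(same / len(neighbors))
    return ratios


def clean_overlap(X, y, k=5, threshold=0.5):
    """Drop training instances whose same-label ratio is below `threshold`.
    The cutoff rule is an assumption; the abstract does not define it."""
    ratios = same_label_ratio(X, y, k)
    keep = [i for i, r in enumerate(ratios) if r >= threshold]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy training set: two clusters, plus one label-1 point
# sitting inside the label-0 cluster (a class-overlap instance).
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (0.05, 0.05)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

X_clean, y_clean = clean_overlap(X, y, k=3, threshold=0.5)
print(len(X), "->", len(X_clean))
```

In a cross-version setting, a cleaning step like this would be applied only to the training data (the earlier version), as the abstract states, leaving the later version used for evaluation untouched.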
Pages: 1895-1918
Page count: 24
Related papers
50 in total
  • [1] Empirical Study: Are Complex Network Features Suitable for Cross-Version Software Defect Prediction?
    Gao, Houleng
    Lu, Minyan
    Pan, Cong
    Xu, Biao
    PROCEEDINGS OF 2019 IEEE 10TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2019), 2019, : 206 - 210
  • [2] Active learning empirical research on cross-version software defect prediction datasets
    Li F.
    Qu Y.
    Ji J.
    Zhang D.
    Li L.
    International Journal of Performability Engineering, 2020, 16 (04) : 609 - 617
  • [3] Evolutionary Measures for Object-oriented Projects and Impact on the Performance of Cross-version Defect Prediction
    Yu, Qiao
    Zhu, Yi
    Han, Hui
    Zhao, Yu
    Jiang, Shujuan
    Qian, Junyan
    13TH ASIA-PACIFIC SYMPOSIUM ON INTERNETWARE, INTERNETWARE 2022, 2022, : 192 - 201
  • [4] Multi-objective cross-version defect prediction
    Shukla, Swapnil
    Radhakrishnan, T.
    Muthukumaran, K.
    Neti, Lalita Bhanu Murthy
    SOFT COMPUTING, 2018, 22 (06) : 1959 - 1980
  • [5] Multi-objective cross-version defect prediction
    Swapnil Shukla
    T. Radhakrishnan
    K. Muthukumaran
    Lalita Bhanu Murthy Neti
    Soft Computing, 2018, 22 : 1959 - 1980
  • [6] Ridge and Lasso Regression Models for Cross-Version Defect Prediction
    Yang, Xiaoxing
    Wen, Wushao
    IEEE TRANSACTIONS ON RELIABILITY, 2018, 67 (03) : 885 - 896
  • [7] A Drift Propensity Detection Technique to Improve the Performance for Cross-Version Software Defect Prediction
    Kabir, Md Alamgir
    Keung, Jacky W.
    Bennin, Kwabena E.
    Zhang, Miao
    2020 IEEE 44TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2020), 2020, : 882 - 891
  • [8] Cross-Version Defect Prediction using Cross-Project Defect Prediction Approaches: Does it work?
    Amasaki, Sousuke
    PROMISE'18: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON PREDICTIVE MODELS AND DATA ANALYTICS IN SOFTWARE ENGINEERING, 2018, : 32 - 41
  • [9] Using evolutionary process for cross-version software defect prediction
    Li Y.
    Liu Z.
    Zhang H.
    International Journal of Performability Engineering, 2019, 15 (09) : 2484 - 2493
  • [10] Connecting historical changes for cross-version software defect prediction
    Bai, Xue
    Zhou, Hua
    Yang, Hongji
    Wang, Dong
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2020, 63 (04) : 371 - 383