An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction

Cited by: 0
Authors
Han, Hui [1]
Yu, Qiao [1]
Zhu, Yi [1]
Cheng, Shengyi [1]
Zhang, Yu [1]
Affiliations
[1] Jiangsu Normal Univ, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Software defect prediction; cross-version defect prediction; class overlap
DOI
10.1142/S0218194024500414
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The class overlap problem refers to instances from different classes overlapping heavily in the feature space. It is one of the obstacles to improving the performance of software defect prediction (SDP). Existing studies on the impact of class overlap on SDP have mainly focused on within-project and cross-project defect prediction, and the existing methods for cleaning class-overlap instances are not suitable for cross-version defect prediction. In this paper, we propose a class-overlap instance cleaning method based on the Ratio of K-nearest neighbors with the Same Label (RKSL). This method removes instances with an abnormal neighbor ratio from the training set. Using the RKSL method, we investigate the impact of class overlap on the performance and interpretability of cross-version defect prediction models. The experimental results show that class overlap can significantly affect the performance of cross-version defect prediction models. The RKSL method can handle the class overlap problem in defect datasets, but it may affect the interpretability of the models. From an analysis of feature changes, we conclude that cleaning class-overlap instances can help models identify more important features.
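The idea of a same-label neighbor ratio can be illustrated with a minimal sketch. This is not the authors' RKSL implementation: the abstract does not define what counts as an "abnormal" ratio, so the function names `same_label_ratio` and `clean_overlap`, the neighborhood size `k`, the Euclidean distance, and the cutoff `threshold` are all assumptions made for illustration.

```python
import numpy as np

def same_label_ratio(X, y, k=5):
    """Fraction of each instance's k nearest neighbors that share its label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # an instance is not its own neighbor
    nn_idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return (y[nn_idx] == y[:, None]).mean(axis=1)

def clean_overlap(X, y, k=5, threshold=0.5):
    """Drop training instances whose same-label ratio falls below threshold.

    The threshold rule is a hypothetical stand-in for the paper's notion
    of an abnormal neighbor ratio.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = same_label_ratio(X, y, k) >= threshold
    return X[keep], y[keep], keep

# Toy data: one label-1 instance sits inside the label-0 cluster.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
     [10, 10], [10, 11], [11, 10], [11, 11]]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
X_clean, y_clean, keep = clean_overlap(X, y, k=3, threshold=0.5)
```

In this toy example only the instance at (0.5, 0.5) is removed, since none of its three nearest neighbors shares its label. In a cross-version setting, the cleaning would be applied to the earlier version used for training, leaving the later version used for testing untouched.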
Pages: 1895 - 1918
Page count: 24
Related papers
50 in total
  • [41] Combined classifier for cross-project defect prediction: an extended empirical study
    Zhang, Yun
    Lo, David
    Xia, Xin
    Sun, Jianling
    FRONTIERS OF COMPUTER SCIENCE, 2018, 12 (02) : 280 - 296
  • [42] An empirical study toward dealing with noise and class imbalance issues in software defect prediction
    Pandey, Sushant Kumar
    Tripathi, Anil Kumar
    SOFT COMPUTING, 2021, 25 (21) : 13465 - 13492
  • [44] WSBCV: A data-driven cross-version defect model via multi-objective optimization and incremental representation learning
    Zhang, Nana
    Zhu, Kun
    Ding, Weiping
    Zhu, Dandan
    INFORMATION SCIENCES, 2024, 669
  • [45] An Empirical Study on Multi-Source Cross-Project Defect Prediction Models
    Liu, Xuanying
    Li, Zonghao
    Zou, Jiaqi
    Tong, Haonan
    2022 29TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC, 2022, : 318 - 327
  • [46] Cross-Project Defect Prediction with Respect to Code Ownership Model: an Empirical Study
    Jureczko, Marian
    Madeyski, Lech
    E-INFORMATICA SOFTWARE ENGINEERING JOURNAL, 2015, 9 (01) : 21 - 35
  • [47] An Empirical Study of Ranking-Oriented Cross-Project Software Defect Prediction
    You, Guoan
    Wang, Feng
    Ma, Yutao
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2016, 26 (9-10) : 1511 - 1538
  • [48] LOCAL KEY ESTIMATION IN CLASSICAL MUSIC RECORDINGS: A CROSS-VERSION STUDY ON SCHUBERT'S WINTERREISE
    Schreiber, Hendrik
    Weiss, Christof
    Mueller, Meinard
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 501 - 505
  • [49] WGNCS: A robust hybrid cross-version defect model via multi-objective optimization and deep enhanced feature representation
    Zhang, Nana
    Ying, Shi
    Ding, Weiping
    Zhu, Kun
    Zhu, Dandan
    INFORMATION SCIENCES, 2021, 570 : 545 - 576
  • [50] An Empirical Study on Heterogeneous Defect Prediction Approaches
    Chen, Haowen
    Jing, Xiao-Yuan
    Li, Zhiqiang
    Wu, Di
    Peng, Yi
    Huang, Zhiguo
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2021, 47 (12) : 2803 - 2822