An Empirical Study for Enhanced Software Defect Prediction Using a Learning-Based Framework

Authors
Kamal Bashir
Tianrui Li
Chubato Wondaferaw Yohannese
Affiliations
[1] Southwest Jiaotong University, School of Information Science and Technology
[2] Karary University, Department of Information Technology, College of Computer Science and Information Technology
Keywords
Software defect prediction; Feature selection; Data sampling; Noise filtering
DOI
Not available
Abstract
The objective of software defect prediction (SDP) is to identify defect-prone modules. This is achieved by constructing prediction models from datasets obtained by mining historical software repositories. However, data mined from these repositories often suffer from high dimensionality, class imbalance, and mislabeled instances, which degrade classification performance and increase model complexity. To mitigate these effects, this paper proposes an integrated preprocessing framework in which feature selection (FS), data balancing (DB), and noise filtering (NF) techniques are combined to address the factors that degrade learning performance. We apply the proposed framework to three types of software metrics, namely static code metrics (SCM), object-oriented metrics (OOM), and combined metrics (CombM), and build models under four scenarios (S): (S1) original data; (S2) FS subsets; (S3) FS subsets after DB using random undersampling (RUS) and the synthetic minority oversampling technique (SMOTE); (S4) FS subsets after DB (RUS and SMOTE) and NF using the iterative partitioning filter (IPF) and iterative noise filtering based on the fusion of classifiers (INFFC). Empirical results show that (1) the integrated preprocessing of FS, DB, and NF improves the performance of all the models built for SDP; (2) for all FS methods, all the models improve progressively from S2 through S4 on all the software metrics; (3) model performance under S4 is statistically significantly better than under S3 for all the software metrics; and (4) achieving optimal model performance for SDP requires appropriate implementation of the proposed framework. The results also validate the effectiveness of our proposal and provide guidelines for obtaining quality training data that enhances model performance for SDP.
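As an illustration of the data-balancing step (DB) named in the abstract, the following is a minimal sketch of random undersampling and a SMOTE-style oversampler in plain Python. It is not the authors' implementation: the function names, the toy data, the label convention (1 for defect-prone, 0 for clean), and the parameter `k` are all assumptions made for this example, and the FS and NF stages are omitted.

```python
import random


def random_under_sample(X, y, seed=0):
    """Randomly drop rows from the larger class until both classes are equal in size.

    Assumes binary labels 0 and 1 (hypothetical convention: 1 = defect-prone).
    """
    rng = random.Random(seed)
    idx = {0: [], 1: []}
    for i, label in enumerate(y):
        idx[label].append(i)
    n = min(len(idx[0]), len(idx[1]))
    keep = sorted(rng.sample(idx[0], n) + rng.sample(idx[1], n))
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_like(X, y, k=2, seed=0):
    """Add synthetic minority rows until both classes are equal in size.

    Each synthetic row lies on the segment between a random minority row and
    one of its k nearest minority neighbours (squared Euclidean distance).
    Requires at least two minority rows.
    """
    rng = random.Random(seed)
    counts = {0: y.count(0), 1: y.count(1)}
    minority = min(counts, key=counts.get)
    pool = [row for row, label in zip(X, y) if label == minority]
    X_new, y_new = list(X), list(y)
    for _ in range(abs(counts[0] - counts[1])):
        base = rng.choice(pool)
        others = [p for p in pool if p is not base]
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        X_new.append([a + gap * (b - a) for a, b in zip(base, other)])
        y_new.append(minority)
    return X_new, y_new


# Toy data: two defect-prone modules and four clean ones (made-up metric values).
X_demo = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [5.5, 8.5], [7.0, 10.0]]
y_demo = [1, 1, 0, 0, 0, 0]

X_rus, y_rus = random_under_sample(X_demo, y_demo)
X_smo, y_smo = smote_like(X_demo, y_demo)
print(len(y_rus), y_rus.count(1), len(y_smo), y_smo.count(1))
```

Undersampling shrinks the dataset to the size of the minority class, whereas the SMOTE-style step grows the minority class with interpolated rows, so the two trade information loss against the risk of synthetic points that do not reflect real modules.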
Pages: 282-298
Page count: 16
Related papers
  • [1] An Empirical Study for Enhanced Software Defect Prediction Using a Learning-Based Framework
    Bashir, Kamal
    Li, Tianrui
    Yohannese, Chubato Wondaferaw
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2019, 12 (01) : 282 - 298
  • [2] bjCnet: A contrastive learning-based framework for software defect prediction
    Han, Jiaxuan
    Huang, Cheng
    Liu, Jiayong
    COMPUTERS & SECURITY, 2024, 145
  • [3] A Systematic Study for Learning-Based Software Defect Prediction
    Cao, Han
    2020 4TH INTERNATIONAL CONFERENCE ON CONTROL ENGINEERING AND ARTIFICIAL INTELLIGENCE (CCEAI 2020), 2020, 1487
  • [4] An empirical framework for defect prediction using machine learning techniques with Android software
    Malhotra, Ruchika
    APPLIED SOFT COMPUTING, 2016, 49 : 1034 - 1050
  • [5] Enhancing Software Defect Prediction Using Supervised-Learning Based Framework
    Bashir, Kamal
    Li, Tianrui
    Yohannese, Chubato Wondaferaw
    Mahama, Yahaya
    2017 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND KNOWLEDGE ENGINEERING (IEEE ISKE), 2017
  • [6] Software Defect Prediction using Propositionalization based Data Preprocessing: An Empirical Study
    Pak, CholMyong
    Wang, Tian Tian
    Su, Xiao Hong
    2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND BUSINESS ANALYTICS (ICDSBA 2018), 2018, : 71 - 77
  • [7] An Empirical Study on Software Defect Prediction Using CodeBERT Model
    Pan, Cong
    Lu, Minyan
    Xu, Biao
    APPLIED SCIENCES-BASEL, 2021, 11 (11)
  • [8] Empirical assessment of machine learning based software defect prediction techniques
    Challagulla, VUB
    Bastani, FB
    Yen, IL
    Paul, RA
    WORDS 2005: 10TH IEEE INTERNATIONAL WORKSHOP ON OBJECT-ORIENTED REAL-TIME DEPENDABLE, PROCEEDINGS, 2005, : 263 - 270
  • [9] Empirical assessment of machine learning based software defect prediction techniques
    Challagulla, Venkata Udaya B.
    Bastani, Farokh B.
    Yen, I-Ling
    Paul, Raymond A.
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2008, 17 (02) : 389 - 400
  • [10] Software Defect Prediction Based on Machine Learning and Deep Learning Techniques: An Empirical Approach
    Albattah, Waleed
    Alzahrani, Musaad
    AI, 2024, 5 (04) : 1743 - 1758