An Empirical Study for Enhanced Software Defect Prediction Using a Learning-Based Framework

Authors
Kamal Bashir
Tianrui Li
Chubato Wondaferaw Yohannese
Affiliations
[1] Southwest Jiaotong University, School of Information Science and Technology
[2] Karary University, Department of Information Technology, College of Computer Science and Information Technology
Source
International Journal of Computational Intelligence Systems | 2018 / Vol. 12
Keywords
Software defect prediction; Feature selection; Data sampling; Noise filtering
DOI
Not available
Abstract
The objective of software defect prediction (SDP) is to identify defect-prone modules. This is achieved by constructing prediction models from datasets obtained by mining historical software repositories. However, data mined from these repositories often suffer from high dimensionality, class imbalance, and mislabeling, which degrade classification performance and increase model complexity. To mitigate these problems, this paper proposes an integrated preprocessing framework in which feature selection (FS), data balancing (DB), and noise filtering (NF) techniques are combined to address the factors that degrade learning performance. We apply the proposed framework to three types of software metrics, namely static code metrics (SCM), object-oriented metrics (OOM), and combined metrics (CombM), and build models under four scenarios (S): (S1) original data; (S2) FS subsets; (S3) FS subsets after DB using random under-sampling (RUS) and the synthetic minority oversampling technique (SMOTE); and (S4) FS subsets after DB (RUS and SMOTE) and NF using the iterative partitioning filter (IPF) and iterative noise filtering based on the fusion of classifiers (INFFC). Empirical results show that (1) the integrated preprocessing of FS, DB, and NF improves the performance of all the models built for SDP; (2) for all FS methods, all the models improve progressively from S2 through S4 on all the software metrics; (3) model performance under S4 is statistically significantly better than under S3 for all the software metrics; and (4) achieving optimal model performance for SDP requires an appropriate implementation of the proposed framework. The results validate the effectiveness of our proposal and provide guidelines for obtaining quality training data that enhances model performance for SDP.
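The S1 to S4 progression described in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation, and several parts are assumptions: the abstract does not name its FS methods, so a correlation ranking is used as a stand-in; only RUS is shown for data balancing (SMOTE is omitted); and a simple nearest-neighbour disagreement filter (in the style of edited nearest neighbours) stands in for IPF and INFFC. All function names and the toy dataset are hypothetical.

```python
import random
from statistics import mean


def pearson_abs(xs, ys):
    """Absolute Pearson correlation; 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def select_features(X, y, k):
    """FS stand-in (assumption): keep the k columns most correlated with the label."""
    scores = [pearson_abs([row[j] for row in X], y) for j in range(len(X[0]))]
    keep = sorted(sorted(range(len(scores)), key=lambda j: -scores[j])[:k])
    return [[row[j] for j in keep] for row in X], keep


def random_under_sample(X, y, seed=0):
    """DB via RUS: randomly drop majority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    idx = sorted(minority + rng.sample(majority, len(minority)))
    return [X[i] for i in idx], [y[i] for i in idx]


def knn_noise_filter(X, y, k=3):
    """NF stand-in (assumption, not IPF/INFFC): drop rows whose label
    disagrees with the majority label of their k nearest neighbours."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    keep = []
    for i, row in enumerate(X):
        others = (j for j in range(len(X)) if j != i)
        neighbours = sorted(others, key=lambda j: dist(row, X[j]))[:k]
        predicted = 1 if 2 * sum(y[j] for j in neighbours) > len(neighbours) else 0
        if predicted == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def make_toy_data(seed=1):
    """Hypothetical data: 60 modules, 12 defective, one informative metric
    (column 1), two irrelevant metrics, and one injected mislabel."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(60):
        label = 1 if i < 12 else 0
        X.append([rng.gauss(0, 1), label * 5.0 + rng.gauss(0, 0.5), rng.gauss(0, 1)])
        y.append(label)
    y[20] = 1  # a clean module wrongly labeled as defective
    return X, y


def run_scenarios():
    X_s1, y_s1 = make_toy_data()                      # S1: original data
    X_s2, kept = select_features(X_s1, y_s1, k=1)     # S2: FS subset
    X_s3, y_s3 = random_under_sample(X_s2, y_s1)      # S3: FS + DB (RUS only)
    X_s4, y_s4 = knn_noise_filter(X_s3, y_s3)         # S4: FS + DB + NF
    return kept, y_s1, y_s3, y_s4


if __name__ == "__main__":
    kept, y_s1, y_s3, y_s4 = run_scenarios()
    print("kept feature columns:", kept)
    print("rows per scenario (S1, S3, S4):", len(y_s1), len(y_s3), len(y_s4))
```

In the study itself, a classifier would be trained and evaluated on the data produced by each scenario; the sketch only shows how the three preprocessing stages compose and why their order matters (noise filtering runs on the already reduced and balanced data).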
Pages: 282-298
Page count: 16
Related papers
50 in total
  • [31] Dictionary Learning Based Software Defect Prediction
    Jing, Xiao-Yuan
    Ying, Shi
    Zhang, Zhi-Wu
    Wu, Shan-Shan
    Liu, Jin
    36TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2014), 2014, : 414 - 423
  • [32] Software defect prediction: A study on software metrics using statistical and machine learning methods
    Canaparo, Marco
    Ronchieri, Elisabetta
    Bertaccini, Gianluca
    INTERNATIONAL SYMPOSIUM ON GRIDS & CLOUDS 2022, 2022,
  • [33] Deep learning or classical machine learning? An empirical study on line-level software defect prediction
    Zhou, Yufei
    Liu, Xutong
    Guo, Zhaoqiang
    Zhou, Yuming
    Zhang, Corey
    Qian, Junyan
    JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2024, 36 (10)
  • [34] Software Defect Prediction Using Call Graph Based Ranking (CGBR) Framework
    Turhan, Burak
    Kocak, Gozde
    Bener, Ayse
    PROCEEDINGS OF THE 34TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS, 2008, : 191 - 198
  • [35] Deployment and performance monitoring of docker based federated learning framework for software defect prediction
    Malhotra, Ruchika
    Bansal, Anjali
    Kessentini, Marouane
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (05): : 6039 - 6057
  • [36] An Empirical Study of IR-based Bug Localization for Deep Learning-based Software
    Kim, Misoo
    Kim, Youngkyoung
    Lee, Eunseok
    2022 IEEE 15TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION (ICST 2022), 2022, : 128 - 139
  • [37] An empirical study of software reliability prediction using machine learning techniques
    Kumar, Pradeep
    Singh, Yogesh
    International Journal of System Assurance Engineering and Management, 2012, 3 (03) : 194 - 208
  • [38] Refined Software Defect Prediction Using Enhanced JAYA Optimization and Extreme Learning Machine
    Pradhan, Debasish
    Muduli, Debendra
    Zamani, Abu Taha
    Yaqoob, Syed Irfan
    Alanazi, Sultan M.
    Kumar, Rakesh Ranjan
    Parveen, Nikhat
    Shameem, Mohammad
    IEEE ACCESS, 2024, 12 : 141559 - 141579
  • [39] An empirical study on software defect prediction with a simplified metric set
    He, Peng
    Li, Bing
    Liu, Xiao
    Chen, Jun
    Ma, Yutao
    INFORMATION AND SOFTWARE TECHNOLOGY, 2015, 59 : 170 - 190
  • [40] An Empirical Study on Regression Techniques for Software Defect Number Prediction
    Wang, Shihan
    He, Yuxin
    Shi, Rongrong
    Jing, Chiyuan
    Liu, Ying
    Tong, Haonan
    PROCEEDINGS OF THE 2023 30TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC 2023, 2023, : 637 - 638