An Empirical Study for Enhanced Software Defect Prediction Using a Learning-Based Framework

Authors
Kamal Bashir
Tianrui Li
Chubato Wondaferaw Yohannese
Affiliations
[1] Southwest Jiaotong University, School of Information Science and Technology
[2] Karary University, Department of Information Technology, College of Computer Science and Information Technology
Source
International Journal of Computational Intelligence Systems | 2018 / Vol. 12
Keywords
Software defect prediction; Feature selection; Data sampling; Noise filtering
Abstract
The objective of software defect prediction (SDP) is to identify defect-prone modules. This is achieved by constructing prediction models from datasets obtained by mining historical software repositories. However, data mined from these repositories often suffer from high dimensionality, class imbalance, and mislabeled instances, which deteriorate classification performance and increase model complexity. To mitigate these problems, this paper proposes an integrated preprocessing framework in which feature selection (FS), data balancing (DB), and noise filtering (NF) techniques are combined to deal with the factors that deteriorate learning performance. We apply the proposed framework to three types of software metrics, namely static code metrics (SCM), object-oriented metrics (OOM), and combined metrics (CombM), and build models under four scenarios (S): (S1) original data; (S2) FS subsets; (S3) FS subsets after DB using random under-sampling (RUS) and the synthetic minority oversampling technique (SMOTE); and (S4) FS subsets after DB (RUS and SMOTE) and NF using the iterative partitioning filter (IPF) and iterative noise filtering based on the fusion of classifiers (INFFC). Empirical results show that (1) the integrated preprocessing of FS, DB, and NF improves the performance of all the models built for SDP; (2) for all FS methods, all the models improve progressively from S2 through S4 on all the software metrics; (3) model performance under S4 is statistically significantly better than under S3 for all the software metrics; and (4) achieving optimal model performance for SDP requires appropriate implementation of the proposed framework. The results also validate the effectiveness of our proposal and provide guidelines for obtaining quality training data that enhances model performance for SDP.
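The data balancing (DB) stage named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, standard-library-only rendering of random under-sampling and a SMOTE-style interpolation for a binary defect dataset. The function names, the toy rows, and the parameters `k` and `seed` are assumptions introduced here for illustration, and the FS and NF stages are left out.

```python
import random

def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows so every class keeps as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

def smote_like_oversample(rows, labels, minority, k=2, seed=0):
    """Add synthetic minority rows until the two classes are equal in size.

    Each synthetic row is a random point on the segment between a minority
    row and one of its k nearest minority neighbors (binary labels assumed).
    """
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    if len(minority_rows) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    majority_count = len(rows) - len(minority_rows)
    out_rows, out_labels = list(rows), list(labels)
    while out_labels.count(minority) < majority_count:
        base = rng.choice(minority_rows)
        # Nearest minority neighbors by squared Euclidean distance.
        neighbors = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()
        out_rows.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
        out_labels.append(minority)
    return out_rows, out_labels

# Hypothetical toy data: two metric values per module, label 1 = defective.
rows = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0],
        [5.5, 8.5], [6.5, 9.5], [7.0, 10.0], [6.2, 9.1]]
labels = [1, 1, 0, 0, 0, 0, 0, 0]

rus_rows, rus_labels = random_under_sample(rows, labels)
smote_rows, smote_labels = smote_like_oversample(rows, labels, minority=1)
print(len(rus_rows), len(smote_rows))
```

In a setup like the one the abstract describes, a balancing step of this kind would presumably run on the training data only, after feature selection and before noise filtering, so that the test data keeps its original class distribution.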
Pages: 282-298
Number of pages: 16