Software Defect Prediction Method Based on Clustering Ensemble Learning

Cited by: 0
Authors
Tao, Hongwei [1]
Cao, Qiaoling [1]
Chen, Haoran [1]
Li, Yanting [1]
Niu, Xiaoxu [1]
Wang, Tao [1]
Geng, Zhenhao [1]
Shang, Songtao [1]
Affiliations
[1] Zhengzhou Univ Light Ind, Sch Comp Sci & Technol, Zhengzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
clustering ensemble learning; feature selection; software defect prediction; quality
DOI
10.1049/2024/6294422
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Software defect prediction aims to assess and predict potential defects in software projects, and it has made significant progress in recent years. Previous studies have largely relied on supervised learning, which requires a substantial amount of labeled historical defect data to train the models. Obtaining such labeled data, however, often demands significant time and resources. In contrast, software defect prediction based on unsupervised learning does not depend on labeled data. It eliminates the need for large-scale data labeling, which saves considerable time and resources and provides a more flexible way to ensure software quality. This paper applies unsupervised learning to software defect prediction on data from 16 projects across two public datasets (PROMISE and NASA). For the feature selection step, a chi-squared sparse feature selection method is proposed that combines chi-squared tests with sparse principal component analysis (SPCA). Specifically, the chi-squared test is first used to select the most statistically significant features, and SPCA is then applied to reduce the dimensionality of these features. For the clustering step, the dot product matrix and the Pearson correlation coefficient (PCC) matrix are used to construct weighted adjacency matrices, and a clustering overlap method is proposed. This method integrates spectral clustering, Newman clustering, fluid clustering, and Clauset-Newman-Moore (CNM) clustering through ensemble learning. Experimental results indicate that, in the absence of labeled data, feature selection with the chi-squared sparse method gives superior performance, and the proposed clustering overlap method performs comparably to or better than the four baseline clustering methods.
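As a rough illustration of the adjacency-matrix step mentioned in the abstract, the following Python sketch builds one weighted adjacency matrix from pairwise dot products and one from pairwise Pearson correlation coefficients. It is not the authors' implementation. The abstract does not say how the matrices are post-processed, so several choices here are assumptions: each row is treated as the feature vector of one software module, self-loops are dropped, negative correlations are clipped to zero so that all edge weights are non-negative, and a constant feature vector is treated as uncorrelated with everything. The function names and the toy data are hypothetical.

```python
import math


def dot_product_adjacency(rows):
    """Weighted adjacency matrix from pairwise dot products of feature vectors."""
    n = len(rows)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weight = sum(a * b for a, b in zip(rows[i], rows[j]))
            adj[i][j] = adj[j][i] = weight  # symmetric, no self-loops (assumption)
    return adj


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # assumption: a constant vector counts as uncorrelated
    return cov / (norm_x * norm_y)


def pcc_adjacency(rows):
    """Weighted adjacency matrix from pairwise Pearson correlations."""
    n = len(rows)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: negative correlations are clipped to zero weight.
            weight = max(0.0, pearson(rows[i], rows[j]))
            adj[i][j] = adj[j][i] = weight
    return adj


# Hypothetical toy data: three modules described by three metrics each.
modules = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 2.0, 1.0],
]
dot_adj = dot_product_adjacency(modules)
pcc_adj = pcc_adjacency(modules)
```

Either matrix could then be passed to a graph clustering routine that accepts precomputed edge weights. The clustering overlap method itself is not sketched here, because the abstract does not describe how the four clusterings are combined.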
Pages: 19