Semi-supervised learning for software quality estimation

Authors
Seliya, N [1]
Khoshgoftaar, TM [1]
Zhong, S [1]
Affiliations
[1] Florida Atlantic Univ, Dept Comp Sci & Engn, Boca Raton, FL 33431 USA
Keywords
semi-supervised learning; software quality estimation; unlabeled data; software metrics; expectation maximization
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A software quality estimation model is often built from known software metrics and fault data collected from program modules of previously developed releases or similar projects. This supervised learning approach assumes that fault data is available for all previously developed modules. In practice, various issues in software project development mean that fault data may not be available for every module in the training data. More specifically, the available labeled training data may be such that a supervised learning approach does not yield good software quality predictions. In such cases, a supervised classification scheme aided by unlabeled data, i.e., semi-supervised learning, may yield better results. This paper investigates semi-supervised learning with the Expectation Maximization (EM) algorithm for the software quality classification problem. The empirical investigation uses software measurement data from two NASA software projects, JM1 and KC2. A small portion of the JM1 dataset is randomly extracted and used as labeled data, while the remaining JM1 instances are used as unlabeled data. The semi-supervised classification models built with the EM algorithm are evaluated using the KC2 project as a test dataset. The results show that the EM-based semi-supervised learning scheme improves the predictive accuracy of the software quality classification models.
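The general idea of EM-based semi-supervised classification can be illustrated with a minimal sketch. The abstract does not specify the underlying model or features, so everything below is an assumption made for illustration: a two-class Gaussian naive Bayes model, synthetic two-feature data standing in for software metrics (not JM1 or KC2), and hypothetical function names such as `fit_em` and `make_modules`. Labeled modules keep fixed class memberships, while unlabeled modules receive soft memberships that are re-estimated on each iteration.

```python
import numpy as np

def m_step(X, R, var_floor=1e-6):
    """Weighted maximum-likelihood estimates given responsibilities R of shape (n, k)."""
    weight = R.sum(axis=0)                        # effective sample count per class
    priors = weight / weight.sum()
    means = (R.T @ X) / weight[:, None]
    variances = (R.T @ X**2) / weight[:, None] - means**2
    return priors, means, np.maximum(variances, var_floor)

def log_joint(X, params):
    """log p(x, class) under an assumed Gaussian naive Bayes model; shape (n, k)."""
    priors, means, variances = params
    sq = (X[:, None, :] - means[None]) ** 2 / variances[None]
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None] + sq).sum(axis=2)
    return np.log(priors)[None] + log_lik

def e_step(X, params):
    """Posterior class probabilities for each row of X."""
    lj = log_joint(X, params)
    lj = lj - lj.max(axis=1, keepdims=True)       # stabilize before exponentiating
    R = np.exp(lj)
    return R / R.sum(axis=1, keepdims=True)

def fit_em(X_lab, y_lab, X_unl, n_iter=25):
    """EM over labeled plus unlabeled rows; y_lab must contain both classes 0 and 1."""
    R_lab = np.eye(2)[y_lab]                      # labeled rows: fixed one-hot memberships
    params = m_step(X_lab, R_lab)                 # initialize from labeled data only
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        R_unl = e_step(X_unl, params)             # E-step: soft labels for unlabeled rows
        params = m_step(X_all, np.vstack([R_lab, R_unl]))  # M-step on all rows
    return params

def predict(X, params):
    return log_joint(X, params).argmax(axis=1)

def make_modules(rng, n_per_class):
    """Synthetic stand-in data: class 0 = not fault-prone, class 1 = fault-prone."""
    X0 = rng.normal(0.0, 1.0, size=(n_per_class, 2))
    X1 = rng.normal(3.0, 1.0, size=(n_per_class, 2))
    return np.vstack([X0, X1]), np.repeat([0, 1], n_per_class)

rng = np.random.default_rng(0)
X_lab, y_lab = make_modules(rng, 10)              # small labeled portion
X_unl, _ = make_modules(rng, 250)                 # labels discarded: unlabeled portion
X_test, y_test = make_modules(rng, 200)           # separate test set

params = fit_em(X_lab, y_lab, X_unl)
sup_params = m_step(X_lab, np.eye(2)[y_lab])      # supervised baseline, labeled data only
acc_em = (predict(X_test, params) == y_test).mean()
acc_sup = (predict(X_test, sup_params) == y_test).mean()
print(f"supervised accuracy: {acc_sup:.3f}, semi-supervised accuracy: {acc_em:.3f}")
```

The split mirrors the setup described in the abstract only in shape: a small labeled set, a larger unlabeled set from the same source, and a separate test set. The accuracies printed here say nothing about the paper's reported results.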
Pages: 183-190
Page count: 8