Semi-supervised learning for software quality estimation

Authors
Seliya, N [1]
Khoshgoftaar, TM [1]
Zhong, S [1]
Affiliations
[1] Florida Atlantic Univ, Dept Comp Sci & Engn, Boca Raton, FL 33431 USA
Keywords
semi-supervised learning; software quality estimation; unlabeled data; software metrics; expectation maximization;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A software quality estimation model is often built using known software metrics and fault data obtained from program modules of previously developed releases or similar projects. Such a supervised learning approach to software quality estimation assumes that fault data is available for all the previously developed modules. Considering the various practical issues in software project development, fault data may not be available for all the software modules in the training data. More specifically, the available labeled training data is such that a supervised learning approach may not yield good software quality prediction. In contrast, a supervised classification scheme aided by unlabeled data, i.e., semi-supervised learning, may yield better results. This paper investigates semi-supervised learning with the Expectation Maximization (EM) algorithm for the software quality classification problem. Case studies of software measurement data obtained from two NASA software projects, JM1 and KC2, are used in our empirical investigation. A small portion of the JM1 dataset is randomly extracted and used as the labeled data, while the remaining JM1 instances are used as unlabeled data. The performance of the semi-supervised classification models built using the EM algorithm is evaluated by using the KC2 project as a test dataset. It is shown that the EM-based semi-supervised learning scheme improves the predictive accuracy of the software quality classification models.
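The abstract does not say which base classifier is paired with EM, so the following is only a minimal sketch of EM-based semi-supervised classification under stated assumptions: a two-class Gaussian naive Bayes model, hypothetical function names (`fit_gnb`, `posterior`, `semi_supervised_em`, `predict`), and synthetic arrays standing in for the JM1 (labeled plus unlabeled) and KC2 (test) software metrics. It is not the authors' implementation.

```python
import numpy as np

def fit_gnb(X, R):
    """M-step: weighted Gaussian naive Bayes parameters.
    X is (n, d); R is (n, k) class responsibilities."""
    Nk = R.sum(axis=0) + 1e-12                      # effective count per class
    priors = Nk / Nk.sum()
    means = (R.T @ X) / Nk[:, None]                 # (k, d)
    diff = X[:, None, :] - means[None, :, :]        # (n, k, d)
    var = (R[:, :, None] * diff ** 2).sum(axis=0) / Nk[:, None] + 1e-6
    return priors, means, var

def posterior(X, priors, means, var):
    """E-step: class posteriors P(c | x) under the current parameters."""
    log_lik = -0.5 * (
        np.log(2 * np.pi * var)[None, :, :]
        + (X[:, None, :] - means[None, :, :]) ** 2 / var[None, :, :]
    ).sum(axis=2)                                   # (n, k)
    log_post = np.log(priors)[None, :] + log_lik
    log_post -= log_post.max(axis=1, keepdims=True) # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def semi_supervised_em(X_lab, y_lab, X_unl, n_classes=2, n_iter=20):
    """Assumed scheme: labels stay fixed, unlabeled rows get soft labels."""
    R_lab = np.eye(n_classes)[y_lab]                # one-hot, never updated
    params = fit_gnb(X_lab, R_lab)                  # initialize from labeled data
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        R_unl = posterior(X_unl, *params)           # E-step on unlabeled modules
        params = fit_gnb(X_all, np.vstack([R_lab, R_unl]))  # M-step on all modules
    return params

def predict(X, params):
    """Assign each module to its most probable class."""
    return posterior(X, *params).argmax(axis=1)

# Synthetic stand-in data (assumption): class 1 plays the fault-prone role.
rng = np.random.default_rng(0)

def make_modules(n, d=3):
    y = np.arange(n) % 2                            # balanced labels
    X = rng.normal(loc=2.0 * y[:, None], scale=1.0, size=(n, d))
    return X, y

X_lab, y_lab = make_modules(20)                     # small labeled portion
X_unl, _ = make_modules(500)                        # labels discarded
X_test, y_test = make_modules(300)                  # separate test project

params = semi_supervised_em(X_lab, y_lab, X_unl)
accuracy = (predict(X_test, params) == y_test).mean()
```

Comparing `accuracy` with that of `fit_gnb(X_lab, np.eye(2)[y_lab])` alone gives a rough view of what the unlabeled data contributes on this toy data; it says nothing about the results reported for JM1 and KC2.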
Pages: 183-190
Page count: 8