A Two-Phase Approach for Semi-Supervised Feature Selection

Citations: 0
Authors
Saxena, Amit [1 ]
Pare, Shreya [2 ]
Meena, Mahendra Singh [2 ]
Gupta, Deepak [3 ]
Gupta, Akshansh [4 ]
Razzak, Imran [5 ]
Lin, Chin-Teng [2 ]
Prasad, Mukesh [2 ]
Affiliations
[1] Guru Ghasidas Univ, Dept Comp Sci & Informat Technol, Bilaspur 495009, Chhattisgarh, India
[2] Univ Technol Sydney, Sch Comp Sci, FEIT, Sydney, NSW 2007, Australia
[3] Natl Inst Technol Arunachal Pradesh, Dept Comp Sci & Engn, Yupia 791112, India
[4] Cent Elect Engn Res Inst, Delhi 110028, India
[5] Deakin Univ, Sch Informat Technol, Geelong, Vic 3217, Australia
Funding
Australian Research Council
Keywords
feature selection; semi-supervised datasets; classification; clustering; correlation; RECOGNITION;
DOI
10.3390/a13090215
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a novel approach for selecting a subset of features in semi-supervised datasets, where only some of the patterns are labeled. The process is completed in two phases. In the first phase (Phase-I), the dataset is divided into two parts: the first contains the labeled patterns and the second contains the unlabeled patterns. A small number of features is identified using well-known maximum-relevance (computed on the first part) and minimum-redundancy (computed on the whole dataset) feature selection approaches based on the correlation coefficient. From the identified features, the subset that produces a high classification accuracy on the labeled patterns, using any supervised classifier, is selected for later processing. In the second phase (Phase-II), the patterns of the first and second parts are clustered separately, each into as many clusters as the dataset has classes. Each cluster of the first part is assigned the class to which the majority of its patterns belong; these classes are already known. The cluster centroids of the first and second parts are then paired: the centroid of the second part nearest to a centroid of the first part is paired with it. Because the class of the first-part centroid is known, the same class can be assigned to the paired second-part cluster, whose class is unknown. If the actual classes of the second-part patterns are available, they can be used to test the classification accuracy on the second part. The proposed two-phase approach performs well in terms of classification accuracy and number of features selected on the given benchmark datasets.
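The two phases described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the function names (`select_features`, `kmeans`, `label_unlabeled`), the scoring rule (relevance minus mean redundancy), the use of plain k-means with farthest-first initialization, and the toy data. The classifier-based choice of the final subset in Phase-I is omitted, and each unlabeled cluster simply takes the class of its nearest labeled centroid instead of a strict one-to-one pairing.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def column(rows, j):
    return [r[j] for r in rows]


def select_features(labeled, labels, all_rows, k):
    """Phase-I sketch: greedy max-relevance / min-redundancy ranking."""
    n_features = len(labeled[0])
    # Relevance uses labeled patterns only; labels are assumed numeric.
    relevance = [abs(pearson(column(labeled, j), labels))
                 for j in range(n_features)]
    selected = []
    while len(selected) < min(k, n_features):
        best, best_score = None, -math.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Redundancy uses the whole dataset (labeled + unlabeled).
            redundancy = 0.0
            if selected:
                redundancy = sum(
                    abs(pearson(column(all_rows, j), column(all_rows, s)))
                    for s in selected
                ) / len(selected)
            score = relevance[j] - redundancy  # assumed scoring rule
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: math.dist(point, centroids[c]))


def kmeans(rows, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(math.dist(r, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        assign = [nearest(r, centroids) for r in rows]
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return centroids, [nearest(r, centroids) for r in rows]


def label_unlabeled(labeled, labels, unlabeled, n_classes):
    """Phase-II sketch: cluster both parts, transfer classes via centroids."""
    cen_l, asg_l = kmeans(labeled, n_classes)
    cen_u, asg_u = kmeans(unlabeled, n_classes)
    # Majority class of each labeled cluster.
    cluster_class = []
    for c in range(n_classes):
        votes = Counter(y for y, a in zip(labels, asg_l) if a == c)
        cluster_class.append(votes.most_common(1)[0][0] if votes else None)
    # Each unlabeled cluster takes the class of its nearest labeled centroid.
    class_of_u = [cluster_class[nearest(cu, cen_l)] for cu in cen_u]
    return [class_of_u[a] for a in asg_u]


def project(rows, features):
    """Keep only the selected feature columns."""
    return [[r[j] for j in features] for r in rows]


# Hypothetical toy data: feature 0 separates the two classes.
X_lab = [[0.0, 1.0, 5.0], [0.2, 3.0, 1.0], [0.1, 2.0, 3.0],
         [5.0, 2.0, 4.0], [5.2, 4.0, 2.0], [5.1, 3.0, 3.0]]
y_lab = [0, 0, 0, 1, 1, 1]
X_unl = [[0.15, 2.0, 2.0], [0.05, 3.0, 4.0], [5.15, 1.0, 3.0], [4.9, 3.0, 1.0]]

features = select_features(X_lab, y_lab, X_lab + X_unl, k=1)
predicted = label_unlabeled(project(X_lab, features), y_lab,
                            project(X_unl, features), n_classes=2)
print(features, predicted)
```

On this toy data the sketch keeps only the first feature and assigns the first two unlabeled patterns to class 0 and the last two to class 1. A real experiment would replace the fixed `k` with the classifier-based subset selection that the abstract describes.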
Pages: 23
Related papers
50 in total
  • [11] Joint Semi-Supervised Feature Selection and Classification through Bayesian Approach
    Jiang, Bingbing
    Wu, Xingyu
    Yu, Kui
    Chen, Huanhuan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019 : 3983 - 3990
  • [12] Semi-supervised relevance index for feature selection
    Coelho, Frederico
    Castro, Cristiano
    Braga, Antonio P.
    Verleysen, Michel
    NEURAL COMPUTING & APPLICATIONS, 2019, 31 (Suppl 2) : 989 - 997
  • [13] Simple strategies for semi-supervised feature selection
    Sechidis, Konstantinos
    Brown, Gavin
    MACHINE LEARNING, 2018, 107 (02) : 357 - 395
  • [14] Locality sensitive semi-supervised feature selection
    Zhao, Jidong
    Lu, Ke
    He, Xiaofei
    NEUROCOMPUTING, 2008, 71 (10-12) : 1842 - 1849
  • [15] A recursive feature retention method for semi-supervised feature selection
    Qingqing Pang
    Li Zhang
    International Journal of Machine Learning and Cybernetics, 2021, 12 : 2639 - 2657
  • [16] A recursive feature retention method for semi-supervised feature selection
    Pang, Qingqing
    Zhang, Li
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (09) : 2639 - 2657
  • [17] Adaptive Feature Selection and Feature Fusion for Semi-supervised Classification
    Du, Wei
    Phlypo, Ronald
    Adali, Tulay
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2019, 91 (05) : 521 - 537
  • [18] Adaptive Feature Selection and Feature Fusion for Semi-supervised Classification
    Wei Du
    Ronald Phlypo
    Tülay Adalı
    Journal of Signal Processing Systems, 2019, 91 : 521 - 537
  • [19] A novel semi-supervised approach for feature extraction
    Qiu, Junyang
    Zhang, Yanyan
    Pan, Zhisong
    Yang, Haimin
    Ren, Huifeng
    Li, Xin
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016 : 3765 - 3770
  • [20] A graph Laplacian based approach to semi-supervised feature selection for regression problems
    Doquire, Gauthier
    Verleysen, Michel
    NEUROCOMPUTING, 2013, 121 : 5 - 13