A Two-Phase Approach for Semi-Supervised Feature Selection

Times cited: 0
Authors
Saxena, Amit [1 ]
Pare, Shreya [2 ]
Meena, Mahendra Singh [2 ]
Gupta, Deepak [3 ]
Gupta, Akshansh [4 ]
Razzak, Imran [5 ]
Lin, Chin-Teng [2 ]
Prasad, Mukesh [2 ]
Affiliations
[1] Guru Ghasidas Univ, Dept Comp Sci & Informat Technol, Bilaspur 495009, Chhattisgarh, India
[2] Univ Technol Sydney, Sch Comp Sci, FEIT, Sydney, NSW 2007, Australia
[3] Natl Inst Technol Arunachal Pradesh, Dept Comp Sci & Engn, Yupia 791112, India
[4] Cent Elect Engn Res Inst, Delhi 110028, India
[5] Deakin Univ, Sch Informat Technol, Geelong, Vic 3217, Australia
Funding
Australian Research Council;
Keywords
feature selection; semi-supervised datasets; classification; clustering; correlation; RECOGNITION;
DOI
10.3390/a13090215
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a novel approach for selecting a subset of features in semi-supervised datasets, where only some of the patterns are labeled. The whole process is completed in two phases. In the first phase, i.e., Phase-I, the whole dataset is divided into two parts: the first part, which contains labeled patterns, and the second part, which contains unlabeled patterns. In the first part, a small number of features are identified using well-known maximum relevance (from the first part) and minimum redundancy (from the whole dataset) based feature selection approaches using the correlation coefficient. The subset of the identified features that produces a high classification accuracy on the labeled patterns, using any supervised classifier, is selected for later processing. In the second phase, i.e., Phase-II, the patterns belonging to the first and second parts are clustered separately into as many clusters as the dataset has classes. In each cluster of the first part, the majority class of its patterns, whose labels are already given, is taken as the class of that cluster. The cluster centroids of the first and second parts are then paired: the centroid of the second part nearest to a centroid of the first part is paired with it. As the class of the first centroid is known, the same class can be assigned to the paired centroid of the second part, whose class is unknown. If the actual classes of the patterns in the second part are known, they can be used to test the classification accuracy on the second part. The proposed two-phase approach performs well in terms of classification accuracy and number of features selected on the given benchmark datasets.
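The two phases described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`select_features`, `kmeans`, `label_unlabeled`), the greedy relevance-minus-redundancy score, the use of plain k-means, and the rule that each unlabeled centroid takes the class of its nearest labeled centroid are all assumptions made for the example. Class labels are assumed to be non-negative integers, and the classifier-based subset evaluation of Phase-I is omitted.

```python
import numpy as np

def select_features(X_lab, y_lab, X_all, k):
    """Phase-I sketch: greedy max-relevance / min-redundancy with |Pearson r|."""
    n_feat = X_all.shape[1]
    # Relevance: |corr(feature, label)| computed on the labeled part only.
    rel = np.array([abs(np.corrcoef(X_lab[:, j], y_lab)[0, 1])
                    for j in range(n_feat)])
    # Redundancy: |corr(feature_i, feature_j)| computed on the whole dataset.
    red = np.abs(np.corrcoef(X_all, rowvar=False))
    selected = [int(np.argmax(rel))]
    while len(selected) < k:
        rest = [j for j in range(n_feat) if j not in selected]
        # Assumed score: relevance minus mean redundancy with chosen features.
        score = [rel[j] - red[j, selected].mean() for j in rest]
        selected.append(rest[int(np.argmax(score))])
    return selected

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster index per pattern)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):
                centroids[c] = X[assign == c].mean(axis=0)
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dist.argmin(axis=1)

def label_unlabeled(X_lab, y_lab, X_unl, n_classes, seed=0):
    """Phase-II sketch: cluster both parts, then transfer classes via centroids."""
    cen_l, asg_l = kmeans(X_lab, n_classes, seed=seed)
    cen_u, asg_u = kmeans(X_unl, n_classes, seed=seed)
    # Majority class of the labeled patterns in each labeled cluster.
    cluster_class = np.array([
        np.bincount(y_lab[asg_l == c], minlength=n_classes).argmax()
        for c in range(n_classes)])
    # Pair each unlabeled centroid with its nearest labeled centroid.
    d = np.linalg.norm(cen_u[:, None, :] - cen_l[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    # Every unlabeled pattern inherits the class of its cluster's paired centroid.
    return cluster_class[nearest][asg_u]
```

In use, `select_features` would be applied first, and `label_unlabeled` would then be run on the selected columns only. Constant-valued features would yield undefined correlations and would need to be filtered out beforehand.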
Pages: 23
Related papers
50 in total
  • [41] BASSUM: A Bayesian semi-supervised method for classification feature selection
    Cai, Ruichu
    Zhang, Zhenjie
    Hao, Zhifeng
    PATTERN RECOGNITION, 2011, 44 (04) : 811 - 820
  • [42] Semi-Supervised Discriminant Feature Selection for Hyperspectral Imagery Classification
    Dong, Chunhua
    Naghedolfeizi, Masoud
    Aberra, Dawit
    Zeng, Xiangyan
    ALGORITHMS, TECHNOLOGIES, AND APPLICATIONS FOR MULTISPECTRAL AND HYPERSPECTRAL IMAGERY XXV, 2019, 10986
  • [43] Efficient multi-view semi-supervised feature selection
    Zhang, Chenglong
    Jiang, Bingbing
    Wang, Zidong
    Yang, Jie
    Lu, Yangfeng
    Wu, Xingyu
    Sheng, Weiguo
    INFORMATION SCIENCES, 2023, 649
  • [44] Clustering-based Feature Selection in Semi-supervised Problems
    Quinzan, Ianisse
    Sotoca, Jose M.
    Pla, Filiberto
    2009 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, 2009, : 535 - 540
  • [45] Feature selection and semi-supervised clustering using multiobjective optimization
    Saha, Sriparna
    Ekbal, Asif
    Alok, Abhay Kumar
    Spandana, Rachamadugu
    SPRINGERPLUS, 2014, 3
  • [46] Semi-supervised Feature Selection via Rescaled Linear Regression
    Chen, Xiaojun
    Nie, Feiping
    Yuan, Guowen
    Huang, Joshua Zhexue
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1525 - 1531
  • [47] Graph Representation Learning Enhanced Semi-Supervised Feature Selection
    Tan, Jun
    Qi, Zhifeng
    Gui, Ning
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2024, 18 (09)
  • [48] Semi-Supervised Local-Learning-based Feature Selection
    Wang, Jim Jing-Yan
    Yao, Jin
    Sun, Yijun
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 1942 - 1948
  • [49] Semi-supervised feature selection based on local discriminative information
    Zeng, Zhiqiang
    Wang, Xiaodong
    Zhang, Jian
    Wu, Qun
    NEUROCOMPUTING, 2016, 173 : 102 - 109
  • [50] Feature Selection and Model Optimization for Semi-supervised Speaker Spotting
    Chetupalli, Srikanth Raj
    Gopalakrishnan, Anand
    Sreenivas, Thippur V.
    2016 24TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2016, : 135 - 139