Semi supervised approach towards subspace clustering

Cited by: 5
Authors
Harikumar, Sandhya [1 ]
Akhil, A. S. [1 ]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Dept Comp Sci & Engn, Amritapuri, India
Keywords
Subspace clustering; semi-supervised; information gain; entropy;
DOI
10.3233/JIFS-169456
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
High-dimensional data analysis has become unavoidable owing to emerging technologies in domains such as finance, healthcare, genomics and signal processing. Though the data sets generated in these domains are high-dimensional, the intrinsic dimensions that provide meaningful information are often far fewer. Conventionally, unsupervised clustering methods known as subspace clustering are used to find clusters in different subspaces of high-dimensional data by identifying relevant features, irrespective of the labels associated with each instance. Available label information, if incorporated into the clustering algorithm, can bias the algorithm towards solutions more consistent with prior knowledge, leading to improved cluster quality. Therefore, an Information Gain based Semi-supervised-subspace Clustering (IGSC) method is proposed that identifies a subset of important attributes based on the known label for each data instance. The label information associated with the data sets is integrated with the search strategy for subspaces and leveraged in a model-based clustering approach. Experiments on 13 real-world labeled data sets demonstrate the feasibility of IGSC, and the clusters obtained are validated using an improvised Davies Bouldin Index (DBI) for semi-supervised clusters.
Pages: 1619-1629
Page count: 11
Related papers
50 in total
  • [31] Pseudo-Supervised Deep Subspace Clustering
    Lv, Juncheng
    Kang, Zhao
    Lu, Xiao
    Xu, Zenglin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 5252 - 5263
  • [32] Multiscale and Auto-Tuned Semi-Supervised Deep Subspace Clustering and Its Application in Brain Tumor Clustering
    Qian, Zhenyu
    Jiang, Yizhang
    Hong, Zhou
    Huang, Lijun
    Li, Fengda
    Lai, Khinwee
    Xia, Kaijian
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 79 (03): : 4741 - 4762
  • [33] Semi-Supervised Bilinear Subspace Learning
    Xu, Dong
    Yan, Shuicheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2009, 18 (07) : 1671 - 1676
  • [34] A semi-supervised approach to projected clustering with applications to microarray data
    Yip, Kevin Y.
    Cheung, Lin
    Cheung, David W.
    Jing, Liping
    Ng, Michael K.
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2009, 3 (03) : 229 - 259
  • [35] A Graph-Based Projection Approach for Semi-supervised Clustering
    Yoshida, Tetsuya
    Okatani, Kazuhiro
    KNOWLEDGE MANAGEMENT AND ACQUISITION FOR SMART SYSTEMS AND SERVICES, 2010, 6232 : 1 - 13
  • [36] A HYBRID APPROACH TO SELECTING INFORMATIVE CONSTRAINTS FOR SEMI-SUPERVISED CLUSTERING
    Ni, Xianhua
    Yang, Yan
    UNCERTAINTY MODELING IN KNOWLEDGE ENGINEERING AND DECISION MAKING, 2012, 7 : 833 - 838
  • [37] A New Approach for Semi-supervised Fuzzy Clustering with Multiple Fuzzifiers
    Tran Manh Tuan
    Mai Dinh Sinh
    Tran Đinh Khang
    Phung The Huan
    Tran Thi Ngan
    Nguyen Long Giang
    Vu Duc Thai
    International Journal of Fuzzy Systems, 2022, 24 : 3688 - 3701
  • [38] A genetic semi-supervised fuzzy clustering approach to text classification
    Liu, H
    Huang, ST
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2003, 2762 : 173 - 180
  • [39] TESC: An approach to TExt classification using Semi-supervised Clustering
    Zhang, Wen
    Tang, Xijin
    Yoshida, Taketoshi
    KNOWLEDGE-BASED SYSTEMS, 2015, 75 : 152 - 160
  • [40] A Novel Multiple Kernel Learning Approach for Semi-Supervised Clustering
    Zare, T.
    Sadeghi, M. T.
    Abutalebi, H. R.
    2013 8TH IRANIAN CONFERENCE ON MACHINE VISION & IMAGE PROCESSING (MVIP 2013), 2013, : 451 - 456