Estimation of Locally Relevant Subspace in High-dimensional Data

Cited by: 6
Authors
Thudumu, Srikanth [1]
Branch, Philip [1]
Jin, Jiong [1]
Singh, Jugdutt [2]
Affiliations
[1] Swinburne Univ Technol, Melbourne, Vic, Australia
[2] Sarawak State Govt, Kuching, Malaysia
Keywords
High-dimensionality problem; Subspace methods; Outlier detection; Locally relevant subspace; The curse of dimensionality problem
DOI
10.1145/3373017.3373032
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
High-dimensional data is becoming increasingly available with the advent of big data and IoT. Additional dimensions make data analysis cumbersome because they increase the sparsity of data points, a problem known as the "curse of dimensionality". Global dimensionality reduction techniques are used to address this problem; however, they are ineffective at revealing outliers hidden in the high-dimensional space. This is because the behaviour of outliers is hidden in the subspace where they belong; hence, a locally relevant subspace is needed to reveal the hidden outliers. In this paper, we present a technique that identifies a locally relevant subspace and associated low-dimensional subspaces by deriving a final correlation score. To verify the effectiveness of the technique in determining the generalised locally relevant subspace, we evaluate the results against a benchmark data set. Our comparative analysis shows that the technique derives a locally relevant subspace consisting of the relevant dimensions present in the benchmark data set.
Pages: 6
Related papers
50 in total
  • [21] Multivariate functional subspace classification for high-dimensional longitudinal data
    Fukuda, Tatsuya
    Matsui, Hidetoshi
    Takada, Hiroya
    Misumi, Toshihiro
    Konishi, Sadanori
    JAPANESE JOURNAL OF STATISTICS AND DATA SCIENCE, 2024, 7 (01) : 1 - 16
  • [22] A modular eigen subspace scheme for high-dimensional data classification
    Chang, YL
    Han, CC
    Jou, FD
    Fan, KC
    Chen, KS
    Chang, JH
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2004, 20 (07) : 1131 - 1143
  • [23] High-Dimensional Matched Subspace Detection When Data are Missing
    Balzano, Laura
    Recht, Benjamin
    Nowak, Robert
    2010 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, 2010 : 1638 - 1642
  • [24] Subspace-Weighted Consensus Clustering for High-Dimensional Data
    Cai, Xiaosha
    Huang, Dong
    ADVANCED DATA MINING AND APPLICATIONS, 2020, 12447 : 3 - 16
  • [25] Locally differentially private high-dimensional data synthesis
    Chen, Xue
    Wang, Cheng
    Yang, Qing
    Hu, Teng
    Jiang, Changjun
    SCIENCE CHINA-INFORMATION SCIENCES, 2023, 66 (01)
  • [26] Locally differentially private high-dimensional data synthesis
    Xue Chen
    Cheng Wang
    Qing Yang
    Teng Hu
    Changjun Jiang
    Science China Information Sciences, 2023, 66
  • [27] Locally differentially private high-dimensional data synthesis
    Xue CHEN
    Cheng WANG
    Qing YANG
    Teng HU
    Changjun JIANG
    Science China (Information Sciences), 2023, 66 (01) : 25 - 42
  • [28] Improved Estimation of High-dimensional Additive Models Using Subspace Learning
    He, Shiyuan
    He, Kejun
    Huang, Jianhua Z.
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2022, 31 (03) : 866 - 876
  • [29] State estimation from high-dimensional data
    Solo, V
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SENSOR ARRAY AND MULTICHANNEL SIGNAL PROCESSING SIGNAL PROCESSING THEORY AND METHODS, 2004 : 685 - 688
  • [30] Efficient Density Estimation for High-Dimensional Data
    Majdara, Aref
    Nooshabadi, Saeid
    IEEE ACCESS, 2022, 10 : 16592 - 16608