High-dimensional labeled data analysis with topology representing graphs

Cited by: 20
Authors
Aupetit, M [1]
Catz, T [1]
Affiliation
[1] CEA, DAM, Dept Anal Surveillance Environm, F-91680 Bruyeres Le Chatel, France
Keywords
exploratory data analysis; high-dimensional labeled data; topology representing graph; Delaunay graph; Gabriel graph; Voronoi cell; classification; decision boundary;
DOI
10.1016/j.neucom.2004.04.009
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose the use of topology representing graphs for the exploratory analysis of high-dimensional labeled data. The Delaunay graph contains all the information needed to analyze the topology of the classes (e.g. the number of separate clusters of a given class, the way these clusters are in contact with each other, or the shape of these clusters). The Delaunay graph also makes it possible to sample the decision boundary of the Nearest Neighbor rule, to define a topological criterion of non-linear separability of the classes, and to find data that lie near the decision boundary and whose labels must therefore be considered carefully. This graph thus provides a way to analyze the complexity of a classification problem, as well as tools for decision support. When the Delaunay graph is intractable because the dimension of the data space is too high, we propose to use the Gabriel graph instead and discuss the limits of this approach. This analysis technique is complementary to projection techniques, as it allows the data to be handled as they are in the data space, avoiding projection distortions. We apply it to analyze the well-known Iris database and a seismic events database. (C) 2004 Elsevier B.V. All rights reserved.
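As a rough illustration of the Gabriel-graph variant mentioned in the abstract, the Python sketch below builds a Gabriel graph by brute force and uses the edges that join points of different classes to sample the Nearest Neighbor decision boundary and to flag the data lying next to it. It is not the authors' implementation: the function names `gabriel_edges` and `boundary_samples`, the O(n^3) emptiness test, and the toy data are all assumptions made for this example. It relies on the standard fact that the midpoint of a Gabriel edge has the edge's two endpoints as its nearest data points, so a mixed-label edge midpoint lies on the decision boundary.

```python
import numpy as np

def gabriel_edges(X):
    """Return Gabriel graph edges (i, j) with i < j, by brute force.

    (i, j) is kept if no other point lies strictly inside the ball whose
    diameter is the segment [X[i], X[j]]. Hypothetical helper, O(n^3).
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Squared Euclidean distances between all pairs of points.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            others = np.ones(n, dtype=bool)
            others[[i, j]] = False
            # k is strictly inside the ball iff d2[i,k] + d2[j,k] < d2[i,j].
            if np.all(d2[i, others] + d2[j, others] >= d2[i, j]):
                edges.append((i, j))
    return edges

def boundary_samples(X, y, edges):
    """Return mixed-label edges, their midpoints, and the endpoints involved."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mixed = [(i, j) for i, j in edges if y[i] != y[j]]
    # Midpoints of mixed-label Gabriel edges sample the 1-NN decision boundary.
    mids = np.array([(X[i] + X[j]) / 2.0 for i, j in mixed])
    # Endpoints of mixed edges are the data closest to the boundary.
    near = sorted({k for edge in mixed for k in edge})
    return mixed, mids, near

# Toy data (assumed for illustration): two classes on a line.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
y = np.array([0, 0, 1, 1])

edges = gabriel_edges(X)
mixed, mids, near = boundary_samples(X, y, edges)
print("Gabriel edges:", edges)
print("Mixed-label edges:", mixed)
print("Boundary samples:", mids.tolist())
print("Points near the boundary:", near)
```

For real data the brute-force test becomes slow as the number of points grows; the sketch is meant only to make the graph-based idea concrete on a small example.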
Pages: 139-169
Page count: 31