Analysis of Incomplete Data and an Intrinsic-Dimension Helly Theorem

Authors
Jie Gao
Michael Langberg
Leonard J. Schulman
Affiliations
[1] Stony Brook University, Department of Computer Science
[2] The Open University of Israel, Computer Science Division
[3] California Institute of Technology, Department of Computer Science
Keywords
Clustering; k-center; Core set; Incomplete data; Helly theorem; Approximation; Inference
DOI
Not available
Abstract
The analysis of incomplete data is a long-standing challenge in practical statistics. When, as is typical, data objects are represented by points in ℝ^d, incomplete data objects correspond to affine subspaces (lines or Δ-flats). With this motivation we study the problem of finding the minimum intersection radius r(ℒ) of a set of lines or Δ-flats ℒ: the least r such that there is a ball of radius r intersecting every flat in ℒ. Known algorithms for finding the minimum enclosing ball for a point set (or clustering by several balls) do not easily extend to higher-dimensional flats, primarily because “distances” between flats do not satisfy the triangle inequality. In this paper we show how to restore geometry (i.e., a substitute for the triangle inequality) to the problem, through a new analog of Helly’s theorem. This “intrinsic-dimension” Helly theorem states: for any family ℒ of Δ-dimensional convex sets in a Hilbert space, there exist Δ+2 sets ℒ′ ⊆ ℒ such that r(ℒ) ≤ 2r(ℒ′). Based upon this we present an algorithm that computes a (1+ε)-core set ℒ′ ⊆ ℒ, |ℒ′| = O(Δ^4/ε), such that the ball centered at a point c with radius (1+ε)r(ℒ′) intersects every element of ℒ. The running time of the algorithm is O(n^(Δ+1) d poly(Δ/ε)). For the case of lines or line segments (Δ=1), the (expected) running time of the algorithm can be improved to O(n d poly(1/ε)). We note that the size of the core set depends only on the dimension of the input objects and is independent of the input size n and the dimension d of the ambient space.
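The two geometric notions in the abstract, the radius needed for a ball to intersect every flat and the failure of the triangle inequality for distances between flats, can be illustrated with a small numerical sketch. This is not the paper's algorithm: the function names (`dist_point_to_flat`, `dist_flat_to_flat`, `radius_at`) and the three example lines are hypothetical, chosen only for illustration, and each flat is assumed to be given as a base point `p` together with a matrix `B` whose columns are orthonormal.

```python
import numpy as np

def dist_point_to_flat(c, p, B):
    """Distance from point c to the flat {p + B t}.
    Assumes the columns of B (shape d x Delta) are orthonormal."""
    v = c - p
    # Subtract the component of v that lies inside the flat.
    return float(np.linalg.norm(v - B @ (B.T @ v)))

def dist_flat_to_flat(p1, B1, p2, B2):
    """Smallest distance between two flats, found by least squares:
    minimize |(p1 + B1 s) - (p2 + B2 t)| over the parameters (s, t)."""
    A = np.hstack([B1, -B2])
    b = p2 - p1
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(np.linalg.norm(A @ x - b))

def radius_at(c, flats):
    """Smallest radius of a ball centered at c that intersects every flat."""
    return max(dist_point_to_flat(c, p, B) for p, B in flats)

# Three illustrative lines in the plane (Delta = 1, d = 2).
e1 = np.array([[1.0], [0.0]])
e2 = np.array([[0.0], [1.0]])
line_a = (np.array([0.0, 0.0]), e1)  # the line y = 0
line_b = (np.array([0.0, 0.0]), e2)  # the line x = 0
line_c = (np.array([0.0, 1.0]), e1)  # the line y = 1

d_ab = dist_flat_to_flat(*line_a, *line_b)  # the lines cross
d_bc = dist_flat_to_flat(*line_b, *line_c)  # the lines cross
d_ac = dist_flat_to_flat(*line_a, *line_c)  # parallel lines, one unit apart

# A ball centered midway between the two parallel lines.
r_mid = radius_at(np.array([0.0, 0.5]), [line_a, line_b, line_c])
```

In this example `d_ab` and `d_bc` are both zero while `d_ac` is one, so `d_ac > d_ab + d_bc`: the "distance" between flats violates the triangle inequality. The center (0, 0.5) needs radius 0.5 to touch all three lines, and no smaller radius can work because the two parallel lines are one unit apart.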
Pages: 537-560
Page count: 23
Related papers
50 in total
  • [1] Analysis of Incomplete Data and an Intrinsic-Dimension Helly Theorem
    Gao, Jie
    Langberg, Michael
    Schulman, Leonard J.
    PROCEEDINGS OF THE SEVENTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2006, : 464 - +
  • [2] Analysis of Incomplete Data and an Intrinsic-Dimension Helly Theorem
    Gao, Jie
    Langberg, Michael
    Schulman, Leonard J.
    DISCRETE & COMPUTATIONAL GEOMETRY, 2008, 40 (04) : 537 - 560
  • [3] Intrinsic-dimension analysis for guiding dimensionality reduction and data fusion in multi-omics data processing
    Gliozzo, Jessica
    Soto-Gomez, Mauricio
    Guarino, Valentina
    Bonometti, Arturo
    Cabri, Alberto
    Cavalleri, Emanuele
    Reese, Justin
    Robinson, Peter N.
    Mesiti, Marco
    Valentini, Giorgio
    Casiraghi, Elena
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2025, 160
  • [5] Bounded VC-Dimension Implies a Fractional Helly Theorem
    Jirí Matousek
    Discrete & Computational Geometry, 2004, 31 : 251 - 255
  • [6] Bounded VC-dimension implies a fractional Helly theorem
    Matousek, J
    DISCRETE & COMPUTATIONAL GEOMETRY, 2004, 31 (02) : 251 - 255
  • [7] INTRINSIC DIMENSION OF GEOMETRIC DATA SETS
    Hanika, Tom
    Schneider, Friedrich Martin
    Stumme, Gerd
    TOHOKU MATHEMATICAL JOURNAL, 2022, 74 (01) : 23 - 52
  • [8] Probabilistic Similarity Query on Dimension Incomplete Data
    Cheng, Wei
    Jin, Xiaoming
    Sun, Jian-Tao
    2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 81 - +
  • [9] Data segmentation based on the local intrinsic dimension
    Allegra, Michele
    Facco, Elena
    Denti, Francesco
    Laio, Alessandro
    Mira, Antonietta
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [10] Intrinsic dimension estimation for locally undersampled data
    Erba, Vittorio
    Gherardi, Marco
    Rotondo, Pietro
    SCIENTIFIC REPORTS, 2019, 9 (1)