What is the Intrinsic Dimension of Your Binary Data? - and How to Compute it Quickly

Authors
Hanika, Tom [1]
Hille, Tobias [2, 3]
Affiliations
[1] Univ Hildesheim, Intelligent Informat Syst, Hildesheim, Germany
[2] Univ Kassel, Knowledge & Data Engn Grp, Kassel, Germany
[3] Univ Kassel, Interdisciplinary Res Ctr Informat Syst Design, Kassel, Germany
Keywords
intrinsic dimension; high-dimensional data; binary data; extrinsic dimension
DOI
10.1007/978-3-031-67868-4_7
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Dimensionality is an important aspect of analyzing and understanding (high-dimensional) data. In their 2006 ICDM paper, Tatti et al. answered the question for an (interpretable) dimension of binary data tables by introducing a normalized correlation dimension. In the present work, we revisit their results and contrast them with a concept-based notion of intrinsic dimension (ID) recently introduced for geometric data sets. To do this, we present a novel approximation for this ID that is based on computing concepts only up to a certain support value. We demonstrate and evaluate our approximation using all available datasets from Tatti et al., which have between 469 and 41271 extrinsic dimensions. (Source code and more figures are available at https://codeberg.org/thille/bd-gid.)
Pages: 97-112
Page count: 16