Nearest neighbor search on vertically partitioned high-dimensional data

Authors
Dellis, E [1]
Seeger, B [1]
Vlachou, A [1]
Affiliation
[1] Univ Marburg, Dept Math & Comp Sci, D-35032 Marburg, Germany
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a new approach to indexing multidimensional data that is particularly suitable for the efficient incremental processing of nearest neighbor queries. The basic idea is to use index-striping that vertically splits the data space into multiple low- and medium-dimensional data spaces. The data from each of these lower-dimensional subspaces is organized by using a standard multi-dimensional index structure. In order to perform incremental NN-queries on top of index-striping efficiently, we first develop an algorithm for merging the results received from the underlying indexes. Then, an accurate cost model relying on a power law is presented that determines an appropriate number of indexes. Moreover, we consider the problem of dimension assignment, where each dimension is assigned to a lower-dimensional subspace, such that the cost of nearest neighbor queries is minimized. Our experiments confirm the validity of our cost model and evaluate the performance of our approach.
Pages: 243-253
Page count: 11
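
The abstract describes merging the incremental nearest-neighbor results delivered by several lower-dimensional indexes but gives no algorithmic detail. The following is a minimal sketch of one way such a merge could work, using a Fagin-style threshold strategy over squared Euclidean distance. It is not necessarily the authors' algorithm. The names `sq_dist`, `subspace_stream`, and `incremental_nn` are hypothetical, and each per-subspace index is replaced by a plain sorted list for illustration.

```python
import heapq


def sq_dist(p, q, dims):
    """Squared Euclidean distance between p and q restricted to dims."""
    return sum((p[d] - q[d]) ** 2 for d in dims)


def subspace_stream(points, query, dims):
    """Stand-in for an incremental NN index on one subspace (assumption:
    a real system would use a multi-dimensional index here).
    Yields (partial squared distance, point id) in ascending order."""
    ranked = sorted((sq_dist(p, query, dims), i) for i, p in enumerate(points))
    yield from ranked


def incremental_nn(points, query, partitions):
    """Yield (full squared distance, point id) in non-decreasing distance
    order by merging one ranked stream per group of dimensions."""
    all_dims = [d for dims in partitions for d in dims]
    streams = [subspace_stream(points, query, dims) for dims in partitions]
    last = [0.0] * len(streams)   # last partial distance read per stream
    live = [True] * len(streams)  # streams that are not yet exhausted
    seen = set()
    candidates = []               # min-heap of (full squared distance, id)

    while any(live):
        # Read one entry from every stream in round-robin fashion.
        for s, stream in enumerate(streams):
            if not live[s]:
                continue
            item = next(stream, None)
            if item is None:
                live[s] = False
                continue
            last[s], pid = item
            if pid not in seen:
                seen.add(pid)
                full = sq_dist(points[pid], query, all_dims)
                heapq.heappush(candidates, (full, pid))
        # Any point not yet seen lies at least this far from the query,
        # because its partial distance in each subspace is >= last[s].
        threshold = sum(last)
        while candidates and candidates[0][0] <= threshold:
            yield heapq.heappop(candidates)

    # All streams are exhausted: flush the remaining candidates in order.
    while candidates:
        yield heapq.heappop(candidates)
```

Because the function is a generator, a caller can request neighbors one at a time, for example with `itertools.islice(incremental_nn(points, query, [(0, 1), (2, 3)]), k)`, and the streams are read only as far as needed to confirm each result. The sketch relies on squared Euclidean distance being the sum of the per-subspace squared distances; other distance functions would need a different threshold.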