Nearest neighbor search on vertically partitioned high-dimensional data

Cited by: 0
Authors
Dellis, E [1]
Seeger, B [1]
Vlachou, A [1]
Affiliation
[1] Univ Marburg, Dept Math & Comp Sci, D-35032 Marburg, Germany
DOI: not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a new approach to indexing multidimensional data that is particularly suitable for the efficient incremental processing of nearest neighbor queries. The basic idea is to use index-striping that vertically splits the data space into multiple low- and medium-dimensional data spaces. The data from each of these lower-dimensional subspaces is organized by using a standard multi-dimensional index structure. In order to perform incremental NN-queries on top of index-striping efficiently, we first develop an algorithm for merging the results received from the underlying indexes. Then, an accurate cost model relying on a power law is presented that determines an appropriate number of indexes. Moreover, we consider the problem of dimension assignment, where each dimension is assigned to a lower-dimensional subspace, such that the cost of nearest neighbor queries is minimized. Our experiments confirm the validity of our cost model and evaluate the performance of our approach.
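The abstract does not spell out the merging algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a threshold-style merge (in the spirit of Fagin's threshold algorithm) over per-subspace streams, squared Euclidean distance, and partitions that cover every dimension exactly once. The names `subspace_stream`, `striped_knn`, and `partial_sq_dist` are hypothetical, and a sorted list stands in for the incremental NN cursor of a real low-dimensional index.

```python
import heapq
import math


def partial_sq_dist(p, q, dims):
    """Squared Euclidean distance restricted to the given dimensions."""
    return sum((p[d] - q[d]) ** 2 for d in dims)


def subspace_stream(points, query, dims):
    """Stand-in for an incremental NN cursor on one low-dimensional index.

    Yields (partial squared distance, point id) in ascending order.
    A real system would pull these lazily from an index structure.
    """
    yield from sorted(
        (partial_sq_dist(p, query, dims), i) for i, p in enumerate(points)
    )


def striped_knn(points, query, partitions, k):
    """k nearest neighbors by merging per-subspace streams (assumed scheme)."""
    streams = [subspace_stream(points, query, dims) for dims in partitions]
    frontier = [0.0] * len(streams)   # last partial distance seen per stream
    exhausted = [False] * len(streams)
    seen = set()
    best = []                         # max-heap via negation: (-full_sq_dist, id)
    all_dims = range(len(query))

    while not all(exhausted):
        for s, stream in enumerate(streams):
            item = next(stream, None)
            if item is None:
                exhausted[s] = True
                continue
            frontier[s], pid = item
            if pid not in seen:
                seen.add(pid)
                full = partial_sq_dist(points[pid], query, all_dims)
                heapq.heappush(best, (-full, pid))
                if len(best) > k:
                    heapq.heappop(best)   # drop the current farthest candidate
        # Any point not yet seen lies at least frontier[s] away in every
        # subspace s, so its full squared distance is >= sum(frontier).
        if len(best) == k and -best[0][0] <= sum(frontier):
            break

    return sorted((math.sqrt(-neg), pid) for neg, pid in best)
```

Under these assumptions, a call such as `striped_knn(points, query, [[0, 1], [2, 3], [4, 5]], k=5)` reads the three two-dimensional streams round-robin and stops as soon as the fifth-best full distance cannot be beaten by any unseen point. How many streams to use and which dimensions go into each are exactly the questions the cost model and the dimension assignment in the abstract address; this sketch takes the partitioning as given.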
Pages: 243-253
Page count: 11