Array-index: a plug&search K nearest neighbors method for high-dimensional data

Citations: 20
Authors
Al Aghbari, Z [1]
Affiliations
[1] Univ Sharjah, Dept Comp Sci, Sharjah, U Arab Emirates
Keywords
indexing methods; image databases; KNN image search; array-index; plug&search method;
DOI
10.1016/j.datak.2004.06.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Previous data partitioning methods (DPMs) for finding the exact K-nearest neighbors (KNN) in high dimensions are outperformed by a linear scan [J.M. Kleinberg, Two algorithms for nearest neighbor search in high dimensions, 29th ACM Symposium on Theory of Computing, 1997; R. Weber, H.-J. Schek, S. Blott, A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces, in: Proc. of the 24th VLDB, USA, 1998]. In this paper, we present a "plug&search" method to greatly speed up the exact KNN search of existing DPMs. The idea is to linearize the data partitions produced by a DPM, rather than the points themselves, into a one-dimensional array-index that is simple, compact and fast. Unlike most DPMs that support KNN search, which require storage space that is linear or exponential in the number of dimensions [J.M. Kleinberg, Two algorithms for nearest neighbor search in high dimensions, 29th ACM Symposium on Theory of Computing, 1997; M. Hagedoorn, Nearest neighbors can be found efficiently if the dimension is small relative to the input size, ICDT 2003], the array-index requires storage space that is linear in the number of mapped partitions. © 2004 Elsevier B.V. All rights reserved.
Pages: 333-352
Page count: 20
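
The idea stated in the abstract, storing partitions rather than points in a flat array and searching that array, might be sketched roughly as follows. This is not the paper's algorithm: the uniform grid standing in for a DPM, the bounding-box entries, the Euclidean metric, and the names `build_array_index`, `min_dist` and `knn` are all assumptions made for illustration.

```python
import heapq
import math
from collections import defaultdict


def build_array_index(points, cells_per_dim=4):
    """Group points into grid cells (a stand-in for a real DPM) and keep one
    (lower corner, upper corner, member ids) entry per non-empty cell."""
    dims = len(points[0])
    buckets = defaultdict(list)
    for pid, p in enumerate(points):
        # Assumes coordinates lie in [0, 1]; 1.0 is clamped into the last cell.
        key = tuple(min(int(x * cells_per_dim), cells_per_dim - 1) for x in p)
        buckets[key].append(pid)
    index = []
    for ids in buckets.values():
        lo = tuple(min(points[i][d] for i in ids) for d in range(dims))
        hi = tuple(max(points[i][d] for i in ids) for d in range(dims))
        index.append((lo, hi, ids))
    return index  # one entry per partition, so size is linear in partitions


def min_dist(q, lo, hi):
    """Smallest Euclidean distance from q to the box [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))


def knn(points, index, q, k):
    """Exact KNN: visit partitions in order of their lower-bound distance."""
    # One sequential pass over the array yields a lower bound per partition.
    order = sorted((min_dist(q, lo, hi), i)
                   for i, (lo, hi, _) in enumerate(index))
    best = []  # max-heap emulated with negated distances: (-dist, pid)
    for bound, i in order:
        if len(best) == k and bound >= -best[0][0]:
            break  # no remaining partition can contain a closer point
        for pid in index[i][2]:
            d = math.dist(q, points[pid])
            if len(best) < k:
                heapq.heappush(best, (-d, pid))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, pid))
    return sorted((-nd, pid) for nd, pid in best)
```

Calling `knn(points, build_array_index(points), query, k)` returns `(distance, point id)` pairs in ascending distance. Because each bounding box gives a true lower bound on the distance to its members, the early exit does not change the result relative to a full linear scan; how much work it saves depends on the data and the partitioning.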