An efficient indexing technique for billion-scale nearest neighbor search

Cited by: 0
Authors
Kaixiang Yang
Hongya Wang
Ming Du
Zhizheng Wang
Zongyuan Tan
Jie Zhang
Yingyuan Xiao
Affiliations
[1] Donghua University, School of Computer Science and Technology
[2] State Key Laboratory of Computer Architecture, Institute of Artificial Intelligence
[3] ICT, School of CSE
[4] CAS
[5] Shanghai Key Laboratory of Computer Software Evaluating and Testing
[6] Donghua University
[7] Tianjin University of Technology
Source
Multimedia Tools and Applications | 2023 / Vol. 82
Keywords
Approximate nearest neighbor search; Hierarchical navigable small world graph; Product quantization; Re-rank
Abstract
Approximate nearest neighbor search is an indispensable component of many computer vision applications. To index more data, such as images, on a single commercial server, Douze et al. introduced L&C, which targets operating points of 64–128 bytes per vector. While the idea is inspiring, we observe that L&C still suffers from the accuracy saturation problem that it aims to solve. To this end, we propose a simple yet effective two-layer graph index structure, together with dual residual encoding, to attain higher accuracy. In particular, we partition the vectors into multiple clusters and build the top-layer graph from the corresponding centroids. For each cluster, a subgraph is created with compact codes of the first-level vector residuals. Such an index structure provides better graph search precision and saves quite a few bytes for compression. We employ second-level residual quantization to re-rank the candidates obtained through graph traversal, which is more efficient than the regression-from-neighbors approach adopted by L&C. Comprehensive experiments show that our proposal obtains over 10% and 30% higher recall@1 than state-of-the-art methods, and achieves up to 7.7x and 6.1x speedup over L&C on Deep1B and Sift1B, respectively. Our proposal also attains 90%+ recall@10 and recall@100 on the two billion-scale datasets at a cost of 10 ms per query.
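The pipeline described in the abstract (cluster the vectors, encode first-level residuals for candidate search, then re-rank with second-level residuals) can be illustrated with a small NumPy sketch. This is not the authors' implementation, and several simplifications are assumptions made here: brute-force scans stand in for both the top-layer graph and the per-cluster subgraphs, a single plain k-means codebook per level stands in for product quantization, and the names `kmeans`, `build_index`, and `search` are hypothetical.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # leave empty clusters where they are
                centroids[j] = members.mean(0)
    d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids, d.argmin(1)

def build_index(x, n_clusters=8, n_codes=32):
    """Two levels of residual encoding on top of a coarse clustering."""
    coarse, assign = kmeans(x, n_clusters)
    r1 = x - coarse[assign]                      # first-level residuals
    book1, code1 = kmeans(r1, n_codes, seed=1)   # assumption: one codebook, not PQ
    r2 = r1 - book1[code1]                       # second-level residuals
    book2, code2 = kmeans(r2, n_codes, seed=2)
    return dict(coarse=coarse, assign=assign,
                book1=book1, code1=code1, book2=book2, code2=code2)

def search(index, q, n_probe=2, n_candidates=20, k=1):
    """Return ids of the k best candidates after second-level re-ranking."""
    # Stage 1: pick the nearest clusters (brute force instead of a graph).
    dc = ((index["coarse"] - q) ** 2).sum(1)
    probe = np.argsort(dc)[:n_probe]
    ids = np.flatnonzero(np.isin(index["assign"], probe))
    # Stage 2: rank by the first-level approximation (centroid + codeword).
    approx1 = (index["coarse"][index["assign"][ids]]
               + index["book1"][index["code1"][ids]])
    keep = np.argsort(((approx1 - q) ** 2).sum(1))[:n_candidates]
    ids, approx1 = ids[keep], approx1[keep]
    # Stage 3: re-rank the shortlist with the finer second-level codeword.
    approx2 = approx1 + index["book2"][index["code2"][ids]]
    order = np.argsort(((approx2 - q) ** 2).sum(1))[:k]
    return ids[order]

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    data = rng.standard_normal((500, 8)).astype(np.float32)
    idx = build_index(data, n_clusters=4, n_codes=16)
    print(search(idx, data[0], n_probe=2, n_candidates=10, k=5))
```

The shortlist is ranked with the cheaper first-level approximation, and only the surviving candidates are refined with the second-level codeword, which mirrors the search-then-re-rank split the abstract describes. The parameters `n_probe` and `n_candidates` are illustrative knobs, not values from the paper.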
Pages: 31673-31689
Page count: 16