An efficient indexing technique for billion-scale nearest neighbor search

Cited by: 0
Authors
Kaixiang Yang
Hongya Wang
Ming Du
Zhizheng Wang
Zongyuan Tan
Jie Zhang
Yingyuan Xiao
Affiliations
[1] Donghua University, School of Computer Science and Technology
[2] State Key Laboratory of Computer Architecture, Institute of Artificial Intelligence
[3] ICT, School of CSE
[4] CAS
[5] Shanghai Key Laboratory of Computer Software Evaluating and Testing
[6] Donghua University
[7] Tianjin University of Technology
Source
Multimedia Tools and Applications | 2023 / Volume 82
Keywords
Approximate nearest neighbor search; Hierarchical navigable small world graph; Product quantization; Re-rank
DOI
Not available
Abstract
Approximate nearest neighbor search is an indispensable component in many computer vision applications. To index more data, such as images, on one commercial server, Douze et al. introduced L&C, which targets operating points of 64–128 bytes per vector. While the idea is inspiring, we observe that L&C still suffers from the accuracy saturation problem that it aims to solve. To this end, we propose a simple yet effective two-layer graph index structure, together with dual residual encoding, to attain higher accuracy. Specifically, we partition vectors into multiple clusters and build the top-layer graph using the corresponding centroids. For each cluster, a subgraph is created with compact codes of the first-level vector residuals. This index structure provides better graph search precision and also saves quite a few bytes for compression. We employ second-level residual quantization to re-rank the candidates obtained through graph traversal, which is more efficient than the regression-from-neighbors approach adopted by L&C. Comprehensive experiments show that our proposal obtains over 10% and 30% higher recall@1 than state-of-the-art methods, and achieves up to 7.7x and 6.1x speedup over L&C on Deep1B and Sift1B, respectively. Our proposal also attains 90%+ recall@10 and recall@100 on two billion-scale datasets at a cost of 10 ms per query.
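The two-level residual idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation, and several simplifications are assumptions made for brevity: brute-force scans stand in for the top-layer graph and the per-cluster subgraphs, plain k-means codebooks stand in for product quantization, and every name here (`kmeans`, `build_index`, `search`, `reconstruction_errors`, `n_probe`, `n_candidates`) is hypothetical.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Tiny Lloyd's k-means; returns (centroids, nearest-centroid assignment)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):
                centroids[j] = members.mean(0)
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids, dist.argmin(1)

def build_index(x, n_clusters=8, n_codes=32):
    # Partition the vectors; the centroids play the role of the top layer.
    centroids, assign = kmeans(x, n_clusters)
    # First-level residuals, encoded with a codebook (stand-in for PQ).
    r1 = x - centroids[assign]
    cb1, code1 = kmeans(r1, n_codes, seed=1)
    # Second-level residuals, encoded separately and used only for re-ranking.
    r2 = r1 - cb1[code1]
    cb2, code2 = kmeans(r2, n_codes, seed=2)
    return {"centroids": centroids, "assign": assign,
            "cb1": cb1, "code1": code1, "cb2": cb2, "code2": code2}

def search(index, q, n_probe=2, n_candidates=20, k=1):
    # Stage 1: pick the closest clusters (brute force replaces the top-layer graph).
    cdist = ((index["centroids"] - q) ** 2).sum(1)
    probe = np.argsort(cdist)[:n_probe]
    ids = np.flatnonzero(np.isin(index["assign"], probe))
    # Stage 2: rank members by first-level approximations (replaces subgraph traversal).
    approx1 = index["centroids"][index["assign"][ids]] + index["cb1"][index["code1"][ids]]
    keep = np.argsort(((approx1 - q) ** 2).sum(1))[:n_candidates]
    cand = ids[keep]
    # Stage 3: re-rank the candidates with the second-level residual added back.
    approx2 = approx1[keep] + index["cb2"][index["code2"][cand]]
    order = np.argsort(((approx2 - q) ** 2).sum(1))[:k]
    return cand[order]

def reconstruction_errors(index, x):
    """Mean squared reconstruction error after one and after two residual levels."""
    a1 = index["centroids"][index["assign"]] + index["cb1"][index["code1"]]
    a2 = a1 + index["cb2"][index["code2"]]
    return ((x - a1) ** 2).sum(1).mean(), ((x - a2) ** 2).sum(1).mean()

rng = np.random.default_rng(42)
data = rng.normal(size=(500, 16))          # synthetic data, for illustration only
index = build_index(data)
result = search(index, data[0], k=5)       # ids of up to 5 approximate neighbors
err1, err2 = reconstruction_errors(index, data)
```

In this sketch the second-level codes never influence which candidates are collected; they only refine the distance estimates used for the final ordering, which mirrors the re-ranking role described above.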
Pages: 31673-31689
Number of pages: 16