Binary Embedding-based Retrieval at Tencent

Cited by: 2
Authors
Gan, Yukang [1]
Ge, Yixiao [1]
Zhou, Chang [2]
Su, Shupeng [1]
Xu, Zhouchuan [3]
Xu, Xuyuan [2]
Hui, Quanchao [3]
Chen, Xiang [3]
Wang, Yexin [2]
Shan, Ying [1, 3]
Affiliations
[1] Tencent PCG, ARC Lab, Shenzhen, People's Republic of China
[2] Tencent Video, PCG, Shenzhen, People's Republic of China
[3] Tencent Search, PCG, Shenzhen, People's Republic of China
Keywords
embedding-based retrieval; embedding binarization; backward compatibility
DOI
10.1145/3580305.3599782
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, an EBR system aims to identify relevant information in a large corpus that may contain tens or hundreds of billions of documents. With so many documents and highly concurrent queries, storage and computation become expensive and inefficient, making it difficult to scale further. To tackle this challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables customized bits per dimension. Specifically, we compress the full-precision query and document embeddings, generally formulated as float vectors, into a composition of multiple binary vectors using a lightweight transformation model with residual multi-layer perceptron (MLP) blocks. The number of bits in the transformed binary vectors is jointly determined by the output dimension of the MLP blocks (termed m) and the number of residual blocks (termed u), i.e., m × (u + 1). We can therefore tailor the number of bits to each application to trade off accuracy loss against cost savings. Importantly, we enable task-agnostic, efficient training of the binarization model using a new embedding-to-embedding strategy; e.g., training on millions of vectors requires only 2 V100 GPU hours. We also exploit compatible training of binary embeddings so that the BEBR engine can support indexing across multiple embedding versions within a unified system. To further realize efficient search, we propose Symmetric Distance Calculation (SDC) to achieve lower response time than Hamming codes. The technique exploits the Single Instruction Multiple Data (SIMD) units widely available in current CPUs. We have successfully deployed BEBR for web search and copyright detection in Tencent products, including Sogou, Tencent Video, QQ World, etc. The binarization algorithm can be seamlessly generalized to various tasks with multiple modalities, for instance, natural language processing (NLP) and computer vision (CV). Extensive experiments on offline benchmarks and online A/B tests demonstrate the efficiency and effectiveness of our method, which saves 30% to 50% of index costs with almost no loss of accuracy at the system level.
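The bit budget m × (u + 1) and the idea of composing several binary vectors can be illustrated with a small sketch. This is not the paper's learned model: the residual MLP blocks are replaced here by a plain sign function applied to the running residual, and the halving step size, the helper names (`recurrent_binarize`, `reconstruct`, `total_bits`), and the choice of base scale are all assumptions made for illustration only.

```python
from typing import List, Tuple


def sign(x: float) -> int:
    """Map a real value to a binary code in {-1, +1}."""
    return 1 if x >= 0 else -1


def total_bits(m: int, u: int) -> int:
    """Bits per embedding: m dimensions times (u + 1) binary vectors."""
    return m * (u + 1)


def recurrent_binarize(vec: List[float], u: int) -> Tuple[List[List[int]], float]:
    """Encode vec as u + 1 binary vectors by repeatedly binarizing the residual.

    Assumption: each level uses half the step size of the previous one.
    The paper instead learns the transformation with residual MLP blocks.
    """
    scale = sum(abs(x) for x in vec) / len(vec)  # illustrative base step size
    residual = list(vec)
    codes = []
    for level in range(u + 1):
        step = scale / (2 ** level)
        code = [sign(r) for r in residual]
        codes.append(code)
        # Subtract what this level already explains; the rest goes to the next level.
        residual = [r - step * c for r, c in zip(residual, code)]
    return codes, scale


def reconstruct(codes: List[List[int]], scale: float) -> List[float]:
    """Approximate the original vector as a weighted sum of the binary vectors."""
    out = [0.0] * len(codes[0])
    for level, code in enumerate(codes):
        step = scale / (2 ** level)
        for i, c in enumerate(code):
            out[i] += step * c
    return out


if __name__ == "__main__":
    v = [0.9, -0.3, 0.5, -0.7]
    for u in (0, 3):
        codes, scale = recurrent_binarize(v, u)
        approx = reconstruct(codes, scale)
        err = sum(abs(a - b) for a, b in zip(v, approx))
        print(f"u={u} bits={total_bits(len(v), u)} L1 error={err:.3f}")
```

For this particular input, the u = 3 encoding uses four times as many bits as u = 0 and reconstructs the vector more closely, which is the accuracy-versus-cost trade-off the abstract describes. A greedy scheme like this one does not guarantee an improvement at every level for every input.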
Pages: 4056-4067
Number of pages: 12