Binary Embedding-based Retrieval at Tencent

Cited by: 2
Authors
Gan, Yukang [1]
Ge, Yixiao [1]
Zhou, Chang [2]
Su, Shupeng [1]
Xu, Zhouchuan [3]
Xu, Xuyuan [2]
Hui, Quanchao [3]
Chen, Xiang [3]
Wang, Yexin [2]
Shan, Ying [1, 3]
Affiliations
[1] Tencent PCG, ARC Lab, Shenzhen, People's Republic of China
[2] Tencent Video, PCG, Shenzhen, People's Republic of China
[3] Tencent Search, PCG, Shenzhen, People's Republic of China
Keywords
embedding-based retrieval; embedding binarization; backward compatibility
DOI
10.1145/3580305.3599782
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, an EBR system aims to identify relevant information from a large corpus of documents that may be tens or hundreds of billions in size. With massive document collections and highly concurrent queries, storage and computation become expensive and inefficient, making it difficult to scale further. To tackle this challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables customized bits per dimension. Specifically, we compress the full-precision query and document embeddings, generally formulated as float vectors, into a composition of multiple binary vectors using a lightweight transformation model with residual multi-layer perceptron (MLP) blocks. The number of bits in the transformed binary vectors is jointly determined by the output dimension of the MLP blocks (termed m) and the number of residual blocks (termed u), i.e., m × (u + 1). We can therefore tailor the number of bits for different applications to trade off accuracy loss against cost savings. Importantly, we enable task-agnostic, efficient training of the binarization model using a new embedding-to-embedding strategy; e.g., training on millions of vectors requires only 2 V100 GPU hours. We also exploit compatible training of binary embeddings so that the BEBR engine can support indexing among multiple embedding versions within a unified system. To further realize efficient search, we propose Symmetric Distance Calculation (SDC) to achieve lower response time than Hamming codes. The technique exploits Single Instruction Multiple Data (SIMD) units widely available in current CPUs. We have successfully applied BEBR to web search and copyright detection in Tencent products, including Sogou, Tencent Video, QQ World, etc.
The binarization algorithm can be seamlessly generalized to various tasks with multiple modalities, for instance, natural language processing (NLP) and computer vision (CV). Extensive experiments on offline benchmarks and online A/B tests demonstrate the efficiency and effectiveness of our method, which saves 30% to 50% of index costs with almost no loss of accuracy at the system level.
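The bit budget m × (u + 1) and the idea of composing several binary vectors can be illustrated with a minimal sketch. This is not the authors' implementation: the learned residual MLP blocks are replaced here by an identity transform (so m equals the input dimension), the per-level weight 2^-j is an assumption, and a scalar XOR-plus-popcount stands in for the SIMD-based SDC. All function names are hypothetical.

```python
def recurrent_binarize(vec, u):
    """Return u + 1 sign vectors; level j is assumed to carry weight 2**-j."""
    levels, residual = [], list(vec)
    for j in range(u + 1):
        bits = [1 if x >= 0 else -1 for x in residual]  # binarize the residual
        levels.append(bits)
        scale = 2.0 ** -j
        # Subtract what this level already explains (identity stands in for an MLP).
        residual = [r - scale * b for r, b in zip(residual, bits)]
    return levels


def reconstruct(levels):
    """Weighted sum of the binary levels, i.e. the composite approximation."""
    m = len(levels[0])
    return [sum(2.0 ** -j * lv[i] for j, lv in enumerate(levels)) for i in range(m)]


def total_bits(m, u):
    """Storage per embedding: m bits for each of the u + 1 levels."""
    return m * (u + 1)


def pack(bits):
    """Pack a +1/-1 vector into an integer, one bit per dimension."""
    word = 0
    for b in bits:
        word = (word << 1) | (1 if b > 0 else 0)
    return word


def packed_dot(a, b, m):
    """Dot product of two +1/-1 vectors from their packed form via XOR + popcount."""
    return m - 2 * bin(a ^ b).count("1")


def composite_dot(q_levels, d_levels):
    """Inner product of two composite codes as a weighted sum of binary dot products."""
    m = len(q_levels[0])
    q_words = [pack(lv) for lv in q_levels]
    d_words = [pack(lv) for lv in d_levels]
    return sum(2.0 ** -(i + j) * packed_dot(q, d, m)
               for i, q in enumerate(q_words) for j, d in enumerate(d_words))


levels = recurrent_binarize([0.9, -0.3, 0.1, -1.2], u=2)
print(total_bits(m=4, u=2))  # 12
print(reconstruct(levels))   # [0.75, -0.25, 0.25, -1.25]
```

Raising u adds one more m-bit level and, in this toy setting, halves the worst-case per-dimension error, which mirrors the accuracy-versus-cost trade-off described in the abstract.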
Pages: 4056-4067
Page count: 12