Binary Embedding-based Retrieval at Tencent

Cited by: 2

Authors:

Gan, Yukang [1]

Ge, Yixiao [1]

Zhou, Chang [2]

Su, Shupeng [1]

Xu, Zhouchuan [3]

Xu, Xuyuan [2]

Hui, Quanchao [3]

Chen, Xiang [3]

Wang, Yexin [2]

Shan, Ying [1,3]

Affiliations:

[1] Tencent PCG, ARC Lab, Shenzhen, People's Republic of China

[2] Tencent Video, PCG, Shenzhen, People's Republic of China

[3] Tencent Search, PCG, Shenzhen, People's Republic of China

Source:

PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023

Keywords:

embedding-based retrieval; embedding binarization; backward compatibility

DOI:

10.1145/3580305.3599782

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, an EBR system aims to identify relevant information from a large corpus of documents that may be tens or hundreds of billions in size. Storage and computation become expensive and inefficient with massive document collections and highly concurrent queries, making it difficult to scale up further. To tackle this challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables customized bits per dimension. Specifically, we compress the full-precision query and document embeddings, generally formulated as float vectors, into a composition of multiple binary vectors using a lightweight transformation model with residual multi-layer perceptron (MLP) blocks. The number of bits in the transformed binary vectors is jointly determined by the output dimension of the MLP blocks (termed m) and the number of residual blocks (termed u), i.e., m × (u + 1). We can therefore tailor the number of bits for different applications to trade off accuracy loss against cost savings. Importantly, we enable task-agnostic, efficient training of the binarization model using a new embedding-to-embedding strategy, e.g., training on millions of vectors requires only 2 V100 GPU hours. We also exploit compatible training of binary embeddings so that the BEBR engine can support indexing among multiple embedding versions within a unified system. To further realize efficient search, we propose Symmetric Distance Calculation (SDC) to achieve lower response time than Hamming codes. The technique exploits Single Instruction Multiple Data (SIMD) units widely available in current CPUs. We have successfully deployed BEBR in web search and copyright detection for Tencent products, including Sogou, Tencent Video, QQ World, etc. The binarization algorithm can be seamlessly generalized to various tasks with multiple modalities, for instance, natural language processing (NLP) and computer vision (CV). Extensive experiments on offline benchmarks and online A/B tests demonstrate the efficiency and effectiveness of our method, saving 30% to 50% of index costs with almost no loss of accuracy at the system level.
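
As a rough illustration of the bit budget m × (u + 1) described in the abstract, the following Python sketch composes a float vector from u + 1 binary vectors. It is only a stand-in under stated assumptions: the paper learns the transformation with residual MLP blocks, whereas this sketch uses a simple, non-learned greedy residual sign quantization, and the names `total_bits`, `residual_binarize`, and `reconstruct` are hypothetical.

```python
import numpy as np


def total_bits(m: int, u: int) -> int:
    """Bits per embedding as stated in the abstract: m * (u + 1)."""
    return m * (u + 1)


def residual_binarize(x, u: int):
    """Approximate x by a sum of (u + 1) scaled {-1, +1} vectors.

    Assumption: greedy, non-learned residual sign quantization,
    used here in place of the paper's learned residual MLP blocks.
    """
    residual = np.asarray(x, dtype=np.float64).copy()
    codes, scales = [], []
    for _ in range(u + 1):
        b = np.where(residual >= 0, 1.0, -1.0)   # one binary vector of m bits
        s = np.abs(residual).mean()              # least-squares scale for b
        codes.append(b)
        scales.append(s)
        residual = residual - s * b              # pass the remainder onward
    return np.stack(codes), np.array(scales)


def reconstruct(codes, scales):
    """Recombine the scaled binary vectors into a float approximation."""
    return (scales[:, None] * codes).sum(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(64)                  # m = 64 dimensions
    for u in (0, 1, 3):
        codes, scales = residual_binarize(x, u)
        err = np.linalg.norm(x - reconstruct(codes, scales))
        print(f"u={u} bits={total_bits(64, u)} error={err:.4f}")
```

In this sketch each additional residual level costs another m bits and can only reduce the reconstruction error, which mirrors the accuracy-versus-cost trade-off the abstract describes; the actual learned model and its error behavior may differ.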

Citation

Pages: 4056-4067

Page count: 12
