QuadrupletBERT: An Efficient Model For Embedding-Based Large-Scale Retrieval

Authors
Liu, Peiyang [1, 2]
Wang, Sen [3]
Wang, Xi [2]
Ye, Wei [1]
Zhang, Shikun [1]
Affiliations
[1] Peking Univ, Natl Engn Res Ctr Software Engn, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelectron, Beijing, Peoples R China
[3] PX Secur, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The embedding-based large-scale query-document retrieval problem is a hot topic in the information retrieval (IR) field. Considering that pre-trained language models like BERT have achieved great success in a wide variety of NLP tasks, we present a QuadrupletBERT model for effective and efficient retrieval in this paper. Unlike most existing BERT-style retrieval models, which only focus on the ranking phase in retrieval systems, our model makes considerable improvements to the retrieval phase and leverages the distances between simple negative and hard negative instances to obtain better embeddings. Experimental results demonstrate that our QuadrupletBERT achieves state-of-the-art results in embedding-based large-scale retrieval tasks.
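The abstract does not state the loss function, so the following is only a minimal sketch of a quadruplet-style hinge loss over a query, a positive document, a hard negative, and a simple negative. The function names, the Euclidean distance, the two margins, and the assumed ordering (positive closer to the query than the hard negative, hard negative closer than the simple negative) are illustrative assumptions, not the formulation used in the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def quadruplet_hinge_loss(query, positive, hard_neg, simple_neg,
                          margin_pos=0.2, margin_neg=0.1):
    """Hypothetical quadruplet-style loss (an assumption, not the paper's).

    Penalizes two kinds of violations:
      1. the positive is not closer to the query than the hard negative
         by at least margin_pos;
      2. the hard negative is not closer to the query than the simple
         negative by at least margin_neg.
    """
    d_pos = euclidean(query, positive)
    d_hard = euclidean(query, hard_neg)
    d_simple = euclidean(query, simple_neg)

    pos_term = max(0.0, d_pos - d_hard + margin_pos)
    neg_term = max(0.0, d_hard - d_simple + margin_neg)
    return pos_term + neg_term


# Toy 2-D embeddings; real embeddings would come from an encoder such as BERT.
query = [0.0, 0.0]
positive = [0.0, 1.0]    # distance 1 from the query
hard_neg = [0.0, 2.0]    # distance 2 from the query
simple_neg = [0.0, 4.0]  # distance 4 from the query

# Distances already respect the assumed ordering, so nothing is penalized.
well_separated = quadruplet_hinge_loss(query, positive, hard_neg, simple_neg)

# Swapping the positive and the hard negative violates the first constraint.
violating = quadruplet_hinge_loss(query, hard_neg, positive, simple_neg)

print(well_separated, violating)
```

In this toy setup the first call incurs no penalty because every distance gap exceeds its margin, while the second call is penalized only by the positive-versus-hard-negative term.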
Pages: 3734-3739
Number of pages: 6
Related papers
50 in total
  • [1] Improving Embedding-based Large-scale Retrieval via Label Enhancement
    Liu, Peiyang
    Wang, Xi
    Wang, Sen
    Ye, Wei
    Xi, Xiangyu
    Zhang, Shikun
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 133 - 142
  • [2] PEFA: Parameter-Free Adapters for Large-scale Embedding-based Retrieval Models
    Chang, Wei-Cheng
    Jiang, Jyun-Yu
    Zhang, Jiong
    Al-Darabsah, Mutasem
    Teo, Choon Hui
    Hsieh, Cho-Jui
    Yu, Hsiang-Fu
    Vishwanathan, S. V. N.
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 77 - 86
  • [3] Binary Embedding-based Retrieval at Tencent
    Gan, Yukang
    Ge, Yixiao
    Zhou, Chang
    Su, Shupeng
    Xu, Zhouchuan
    Xu, Xuyuan
    Hui, Quanchao
    Chen, Xiang
    Wang, Yexin
    Shan, Ying
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 4056 - 4067
  • [4] Embedding-based Retrieval in Facebook Search
    Huang, Jui-Ting
    Sharma, Ashish
    Sun, Shuying
    Xia, Li
    Zhang, David
    Pronin, Philip
    Padmanabhan, Janani
    Ottaviano, Giuseppe
    Yang, Linjun
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 2553 - 2561
  • [5] Coupled Binary Embedding for Large-Scale Image Retrieval
    Zheng, Liang
    Wang, Shengjin
    Tian, Qi
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2014, 23 (08) : 3368 - 3380
  • [6] Efficient Supervised Graph Embedding Hashing for large-scale cross-media retrieval
    Yao, Tao
    Wang, Ruxin
    Wang, Jintao
    Li, Ying
    Yue, Jun
    Yan, Lianshan
    Tian, Qi
    PATTERN RECOGNITION, 2024, 145
  • [7] Embedding-based Query Expansion for Weighted Sequential Dependence Retrieval Model
    Balaneshin-kordan, Saeid
    Kotov, Alexander
    SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017, : 1213 - 1216
  • [8] Embedding-based Product Retrieval in Taobao Search
    Li, Sen
    Lv, Fuyu
    Jin, Taiwei
    Lin, Guli
    Yang, Keping
    Zeng, Xiaoyi
    Wu, Xiao-Ming
    Ma, Qianli
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 3181 - 3189
  • [9] Forward Compatible Training for Large-Scale Embedding Retrieval Systems
    Ramanujan, Vivek
    Vasu, Pavan Kumar Anasosalu
    Farhadi, Ali
    Tuzel, Oncel
    Pouransari, Hadi
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 19364 - 19373
  • [10] Random Projection Tree and Multiview Embedding for Large-Scale Image Retrieval
    Xie, Bo
    Mu, Yang
    Song, Mingli
    Tao, Dacheng
    NEURAL INFORMATION PROCESSING: MODELS AND APPLICATIONS, PT II, 2010, 6444 : 641 - +