Improving Embedding-based Large-scale Retrieval via Label Enhancement

Cited by: 0
Authors
Liu, Peiyang [1, 2]
Wang, Xi [2]
Wang, Sen [2]
Ye, Wei [1]
Xi, Xiangyu [1, 3]
Zhang, Shikun [1]
Affiliations
[1] Peking Univ, Natl Engn Res Ctr Software Engn, Beijing, Peoples R China
[2] PX Secur, Beijing, Peoples R China
[3] Meituan Dianping Grp, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Current embedding-based large-scale retrieval models are trained with 0-1 hard labels that indicate whether a query is relevant to a document, ignoring rich information about the degree of relevance. This paper proposes to improve embedding-based retrieval by better characterizing the query-document relevance degree, introducing label enhancement (LE) to this task for the first time. To generate label distributions in the retrieval scenario, we design a novel and effective supervised LE method that incorporates prior knowledge from dynamic term weighting methods into contextual embeddings. By training models with the generated label distributions as auxiliary supervision, our method significantly outperforms four competitive existing retrieval models and its counterparts equipped with two alternative LE techniques. This advantage is observed on English and Chinese large-scale retrieval tasks under both standard and cold-start settings.
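As a rough illustration of what "label distribution as auxiliary supervision" could look like, the sketch below combines a hard-label cross-entropy term with a KL-divergence term toward a soft label distribution over candidate documents. This is not the paper's method: the function names, the weight `alpha`, the example scores, and the example distribution are all assumptions made for illustration, and the label distribution is simply taken as given rather than generated by the LE method the abstract describes.

```python
import math

def softmax(scores):
    """Turn query-document similarity scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hard_label_loss(scores, positive_index):
    """Cross-entropy against a 0-1 hard label (one relevant document)."""
    return -math.log(softmax(scores)[positive_index])

def kl_divergence(target, predicted):
    """KL(target || predicted); terms with zero target mass contribute nothing."""
    return sum(t * math.log(t / p) for t, p in zip(target, predicted) if t > 0)

def combined_loss(scores, positive_index, label_distribution, alpha=0.5):
    """Hard-label loss plus an auxiliary term toward a soft label distribution.

    `alpha` is a hypothetical weighting hyperparameter, not a value from the paper.
    """
    probs = softmax(scores)
    auxiliary = kl_divergence(label_distribution, probs)
    return hard_label_loss(scores, positive_index) + alpha * auxiliary

# Hypothetical example: three candidate documents for one query.
scores = [2.0, 1.0, -1.0]               # e.g. dot products of embeddings
label_distribution = [0.7, 0.25, 0.05]  # assumed relevance degrees, sums to 1
loss = combined_loss(scores, 0, label_distribution)
```

Setting `alpha=0` recovers plain hard-label training, which makes the auxiliary term easy to ablate in this kind of setup.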
Pages: 133-142
Number of pages: 10