Improving Embedding-based Large-scale Retrieval via Label Enhancement

Cited by: 0
Authors
Liu, Peiyang [1, 2]
Wang, Xi [2 ]
Wang, Sen [2 ]
Ye, Wei [1 ]
Xi, Xiangyu [1, 3]
Zhang, Shikun [1 ]
Affiliations
[1] Peking Univ, Natl Engn Res Ctr Software Engn, Beijing, Peoples R China
[2] PX Secur, Beijing, Peoples R China
[3] Meituan Dianping Grp, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Current embedding-based large-scale retrieval models are trained with 0-1 hard labels that indicate whether a query is relevant to a document, ignoring the rich information carried by the degree of relevance. This paper proposes to improve embedding-based retrieval by better characterizing the query-document relevance degree, introducing label enhancement (LE) for the first time. To generate label distributions in the retrieval scenario, we design a novel and effective supervised LE method that incorporates prior knowledge from dynamic term weighting methods into contextual embeddings. By training models with the generated label distribution as auxiliary supervision information, our method significantly outperforms four competitive existing retrieval models and its counterparts equipped with two alternative LE techniques. The superiority is observed on English and Chinese large-scale retrieval tasks under both standard and cold-start settings.
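The idea of using a label distribution as auxiliary supervision alongside 0-1 hard labels can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the KL-divergence form of the auxiliary term, and the weight `alpha` are all assumptions made for illustration, and the label distribution is simply passed in rather than produced by the paper's LE method.

```python
import math

def softmax(scores):
    """Turn raw query-document similarity scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hard_label_loss(probs, positive_index):
    """Cross-entropy against a 0-1 hard label: only the relevant document counts."""
    return -math.log(probs[positive_index])

def kl_divergence(target, probs, eps=1e-12):
    """KL(target || probs); terms with zero target mass contribute nothing."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, probs)
        if t > 0
    )

def combined_loss(scores, positive_index, label_distribution, alpha=0.5):
    """Hard-label loss plus an auxiliary term pulling the model toward a
    soft label distribution. `alpha` is a hypothetical weighting parameter."""
    probs = softmax(scores)
    main = hard_label_loss(probs, positive_index)
    auxiliary = kl_divergence(label_distribution, probs)
    return main + alpha * auxiliary

# Illustrative values only: one query scored against three candidate documents.
example_scores = [2.0, 1.0, 0.1]
example_distribution = [0.7, 0.2, 0.1]  # assumed soft relevance degrees
example_loss = combined_loss(example_scores, 0, example_distribution)
```

With `alpha=0` the sketch reduces to ordinary hard-label training, which makes the auxiliary term easy to ablate.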
Pages: 133-142
Number of pages: 10
Related papers
50 in total
  • [41] Large-Scale Video Retrieval via Deep Local Convolutional Features
    Zhang, Chen
    Hu, Bin
    Suo, Yucong
    Zou, Zhiqiang
    Ji, Yimu
    ADVANCES IN MULTIMEDIA, 2020, 2020
  • [42] Large-Scale Phase Retrieval via Stochastic Reweighted Amplitude Flow
    Xiao, Zhuolei
    Zhang, Yerong
    Yang, Jie
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2020, 14 (11) : 4355 - 4371
  • [43] Spectral embedding-based multiview features fusion for content-based image retrieval
    Feng, Lin
    Yu, Laihang
    Zhu, Hai
    JOURNAL OF ELECTRONIC IMAGING, 2017, 26 (05)
  • [44] Concept-based and embedding-based models in lifelog retrieval: an empirical comparison of performance
    Nguyen, Manh-Duy
    Nguyen, Binh T.
    Gurrin, Cathal
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2025, 14 (02)
  • [45] Improving Large-Scale Image Retrieval Through Robust Aggregation of Local Descriptors
    Husain, Syed Sameed
    Bober, Miroslaw
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (09) : 1783 - 1796
  • [46] Improving Embedding-based Unsupervised Keyphrase Extraction by Incorporating Structural Information
    Song, Mingyang
    Liu, Huafeng
    Feng, Yi
    Jing, Liping
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023 : 1041 - 1048
  • [47] Learning Multilevel Semantic Similarity for Large-Scale Multi-Label Image Retrieval
    Song, Ge
    Tan, Xiaoyang
    ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2018 : 64 - 72
  • [48] Leveraging Active Perception for Improving Embedding-based Deep Face Recognition
    Passalis, Nikolaos
    Tefas, Anastasios
    2020 IEEE 22ND INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2020
  • [49] Large-Scale Image Retrieval Based on Compressed Camera Identification
    Valsesia, Diego
    Coluccia, Giulio
    Bianchi, Tiziano
    Magli, Enrico
    IEEE TRANSACTIONS ON MULTIMEDIA, 2015, 17 (09) : 1439 - 1449
  • [50] Large-scale Entity Alignment via Knowledge Graph Merging, Partitioning and Embedding
    Xin, Kexuan
    Sun, Zequn
    Hua, Wen
    Hu, Wei
    Qu, Jianfeng
    Zhou, Xiaofang
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022 : 2240 - 2249