Fine-Grained Distillation for Long Document Retrieval

Cited by: 0
Authors
Zhou, Yucheng [1, 4]
Shen, Tao [2]
Geng, Xiubo [3]
Tao, Chongyang [3]
Shen, Jianbing [1]
Long, Guodong [2]
Xu, Can [3]
Jiang, Daxin [3]
Affiliations
[1] Univ Macau, CIS, SKL IOTSC, Taipa, Macau, Peoples R China
[2] Univ Technol Sydney, AAII, FEIT, Sydney, NSW, Australia
[3] Microsoft Corp, Redmond, WA 98052 USA
[4] Microsoft, Redmond, WA USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Long document retrieval aims to fetch query-relevant documents from a large-scale collection, where knowledge distillation has become the de facto approach to improving a retriever by having it mimic a heterogeneous yet powerful cross-encoder. However, in contrast to passages or sentences, retrieval on long documents suffers from the scope hypothesis: a long document may cover multiple topics. This maximizes their structural heterogeneity and poses a granularity-mismatch issue, leading to inferior distillation efficacy. In this work, we propose a new learning framework, fine-grained distillation (FGD), for long-document retrievers. While preserving the conventional dense retrieval paradigm, it first produces globally consistent representations across different fine granularities and then applies multi-granular aligned distillation only during training. In experiments, we evaluate our framework on two long-document retrieval benchmarks, where it achieves state-of-the-art performance.
Pages: 19732-19740
Page count: 9