UNIC: Universal Classification Models via Multi-teacher Distillation

Cited by: 0
Authors: Sariyildiz, Mert Bulent [1]; Weinzaepfel, Philippe [1]; Lucas, Thomas [1]; Larlus, Diane [1]; Kalantidis, Yannis [1]
Affiliations: [1] NAVER LABS Europe, Meylan, France
Keywords: Multi-Teacher Distillation; Classification; Generalization; Knowledge Distillation; Ensemble
DOI: 10.1007/978-3-031-73235-5_20
Chinese Library Classification: TP18 (Artificial Intelligence Theory)
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Pretrained models have become a commodity and offer strong results on a broad range of tasks. In this work, we focus on classification and seek to learn a unique encoder able to take advantage of several complementary pretrained models, aiming at even stronger generalization across a variety of classification tasks. We propose to learn such an encoder via multi-teacher distillation. We first thoroughly analyze standard distillation when driven by multiple strong teachers with complementary strengths. Guided by this analysis, we gradually propose improvements to the basic distillation setup. Among those, we enrich the architecture of the encoder with a ladder of expendable projectors, which increases the impact of intermediate features during distillation, and we introduce teacher dropping, a regularization mechanism that better balances the teachers' influence. Our final distillation strategy leads to student models of the same capacity as any of the teachers, while retaining or improving upon the performance of the best teacher for each task.
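The abstract's core ideas — averaging per-teacher distillation losses and stochastically excluding a teacher so that no single one dominates — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the drop probability, and the "drop the teacher with the largest loss" criterion are assumptions chosen for illustration; the paper's exact teacher-dropping rule and loss formulation may differ.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl(p, q, eps=1e-12):
    """KL(p || q), summed over classes, averaged over the batch."""
    return float((p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1).mean())

def multi_teacher_distill_loss(student_logits, teacher_logits_list, rng, drop_prob=0.5):
    """Average distillation loss over teachers, with a simple form of
    'teacher dropping': with probability drop_prob, the teacher whose loss
    currently dominates is excluded from this step, so no single teacher
    monopolizes the gradient signal."""
    q = softmax(student_logits)
    losses = [kl(softmax(t), q) for t in teacher_logits_list]
    keep = list(range(len(losses)))
    if len(keep) > 1 and rng.random() < drop_prob:
        keep.remove(int(np.argmax(losses)))  # drop the currently dominant teacher
    return sum(losses[i] for i in keep) / len(keep)

rng = np.random.default_rng(0)
student = rng.normal(size=(4, 10))                      # batch of 4, 10 classes
teachers = [rng.normal(size=(4, 10)) for _ in range(3)]  # 3 complementary teachers
loss = multi_teacher_distill_loss(student, teachers, rng)
```

In a real training loop the teacher logits would come from frozen pretrained encoders (possibly through the paper's expendable projectors) and `loss` would be backpropagated into the student only.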
Pages: 353-371
Page count: 19