UNIC: Universal Classification Models via Multi-teacher Distillation

Cited: 0
Authors
Sariyildiz, Mert Bulent [1 ]
Weinzaepfel, Philippe [1 ]
Lucas, Thomas [1 ]
Larlus, Diane [1 ]
Kalantidis, Yannis [1 ]
Affiliations
[1] NAVER LABS Europe, Meylan, France
Keywords
Multi-Teacher Distillation; Classification; Generalization; Knowledge Distillation; Ensemble
DOI
10.1007/978-3-031-73235-5_20
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Pretrained models have become a commodity and offer strong results on a broad range of tasks. In this work, we focus on classification and seek to learn a unique encoder able to draw on several complementary pretrained models, aiming at even stronger generalization across a variety of classification tasks. We propose to learn such an encoder via multi-teacher distillation. We first thoroughly analyze standard distillation when driven by multiple strong teachers with complementary strengths. Guided by this analysis, we gradually propose improvements to the basic distillation setup. Among those, we enrich the architecture of the encoder with a ladder of expendable projectors, which increases the impact of intermediate features during distillation, and we introduce teacher dropping, a regularization mechanism that better balances the teachers' influence. Our final distillation strategy leads to student models of the same capacity as any of the teachers, while retaining or improving upon the performance of the best teacher for each task.
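The multi-teacher setup described in the abstract (per-teacher projector heads on a shared student encoder, plus teacher dropping) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the projector shapes, the cosine distillation loss, and the random-masking variant of teacher dropping are all assumptions made for illustration (the paper's projector ladder and dropping criterion differ in detail).

```python
import math
import random

random.seed(0)

D_STUDENT, D_TEACHER, N_TEACHERS = 16, 8, 3

def randvec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]

def matvec(W, x):
    # Dense projector head: maps a student feature into a teacher's space.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cosine(a, b):
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb + 1e-8)

# One expendable projector per teacher; these are discarded after distillation,
# leaving a student of the same capacity as any single teacher.
projectors = [[randvec(D_STUDENT) for _ in range(D_TEACHER)]
              for _ in range(N_TEACHERS)]

def multi_teacher_distill_loss(student_feat, teacher_feats, keep_mask):
    """Mean of per-teacher cosine distillation losses; dropped teachers are skipped."""
    losses = []
    for W, t, keep in zip(projectors, teacher_feats, keep_mask):
        if not keep:
            continue                       # this teacher is dropped for the step
        z = matvec(W, student_feat)        # project student into teacher space
        losses.append(1.0 - cosine(z, t))  # per-teacher loss lies in [0, 2]
    return sum(losses) / max(len(losses), 1)

student = randvec(D_STUDENT)
teachers = [randvec(D_TEACHER) for _ in range(N_TEACHERS)]

# Teacher dropping (illustrative random variant): mask out a subset of
# teachers each step, always keeping at least one.
keep_mask = [random.random() > 0.5 for _ in range(N_TEACHERS)]
if not any(keep_mask):
    keep_mask[0] = True

loss = multi_teacher_distill_loss(student, teachers, keep_mask)
```

Averaging only over the kept teachers keeps the loss scale comparable across steps regardless of how many teachers were dropped.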
Pages: 353-371 (19 pages)
Related Papers (50 in total)
  • [31] Zhang, Jiale; Zhu, Chengcheng; Ge, Chunpeng; Ma, Chuan; Zhao, Yanchao; Sun, Xiaobing; Chen, Bing. BadCleaner: Defending Backdoor Attacks in Federated Learning via Attention-Based Multi-Teacher Distillation. IEEE Transactions on Dependable and Secure Computing, 2024, 21(05): 4559-4573
  • [32] Lu, Zhikun; Zhao, Ying; Li, Jinnan; Tian, Yuan. Learning Semantic Textual Similarity via Multi-Teacher Knowledge Distillation: A Multiple Data Augmentation Method. 2024 9th International Conference on Computer and Communication Systems (ICCCS 2024), 2024: 1197-1203
  • [33] Li, Zhuoran; Hu, Chunming; Zhang, Richong; Chen, Junfan; Guo, Xiaohui. Zero-Shot Cross-Lingual Named Entity Recognition via Progressive Multi-Teacher Distillation. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2024, 32: 4617-4630
  • [34] Wu, Chuhan; Luo, Xiaochuan; Huang, Haoran; Zhang, Yulin. CIMTD: Class Incremental Multi-Teacher Knowledge Distillation for Fractal Object Detection. Pattern Recognition and Computer Vision (PRCV 2024), Pt XII, 2025, 15042: 51-65
  • [35] Cheng, Tiancong; Zhang, Ying; Yin, Yifang; Zimmermann, Roger; Yu, Zhiwen; Guo, Bin. A Multi-Teacher Assisted Knowledge Distillation Approach for Enhanced Face Image Authentication. Proceedings of the 2023 ACM International Conference on Multimedia Retrieval (ICMR 2023), 2023: 135-143
  • [36] Cheng, Xin; Zhang, Zhiqiang; Weng, Wei; Yu, Wenxin; Zhou, Jinjia. DE-MKD: Decoupled Multi-Teacher Knowledge Distillation Based on Entropy. Mathematics, 2024, 12(11)
  • [37] Wu, Xunjin; Chang, Jingfei; Cheng, Wen; Wu, Yunxiang; Li, Yong; Zeng, Lingfang. Enhancing BERT Performance: Multi-Teacher Adversarial Distillation with Clean and Robust Guidance. Conceptual Modeling (ER 2024), 2025, 15238: 3-17
  • [38] Tan, Shengbo; Cai, Ying; Zhao, Yang; Hu, Junjie; Chen, Yuanyuan; He, Chenxi. FM-LiteLearn: A Lightweight Brain Tumor Classification Framework Integrating Image Fusion and Multi-Teacher Distillation Strategies. Artificial Intelligence in Healthcare, Pt II (AIIH 2024), 2024, 14976: 89-103
  • [39] Shang, Ronghua; Li, Wenzheng; Zhu, Songling; Jiao, Licheng; Li, Yangyang. Multi-Teacher Knowledge Distillation Based on Joint Guidance of Probe and Adaptive Corrector. Neural Networks, 2023, 164: 345-356
  • [40] Yang, Yafang; Guo, Bin; Liang, Yunji; Zhao, Kaixing; Yu, Zhiwen. Device Adaptation Free-KDA Based on Multi-Teacher Knowledge Distillation. Journal of Ambient Intelligence and Humanized Computing, 2024, 15(10): 3603-3615