Stealing Knowledge from Pre-trained Language Models for Federated Classifier Debiasing

Cited by: 1
Authors
Zhu, Meilu [1 ]
Yang, Qiushi [2 ]
Gao, Zhifan [3 ]
Liu, Jun [1 ]
Yuan, Yixuan [4 ]
Affiliations
[1] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[3] Sun Yat Sen Univ, Sch Biomed Engn, Guangzhou, Peoples R China
[4] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
Keywords
Federated learning; Medical image classification; Pre-trained language model
DOI
10.1007/978-3-031-72117-5_64
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Federated learning (FL) has shown great potential in medical image computing since it provides a decentralized learning paradigm that allows multiple clients to train a model collaboratively without privacy leakage. However, current studies have shown that heterogeneous client data causes biased classifiers in local models during training, leading to performance degradation of the federated system. In experiments, we found, surprisingly, that continuously freezing local classifiers can significantly improve the performance of the baseline FL method (FedAvg) on heterogeneous data. This observation motivates us to pre-construct a high-quality initial classifier for local models and freeze it during local training to avoid classifier bias. With this insight, we propose a novel approach named Federated Classifier deBiasing (FedCB) to solve the classifier bias problem in heterogeneous federated learning. The core idea behind FedCB is to exploit linguistic knowledge from pre-trained language models (PLMs) to construct high-quality local classifiers. Specifically, FedCB first collects the class concepts from clients and then uses a set of prompts to contextualize them, yielding language descriptions of these concepts. These descriptions are fed into a pre-trained language model to obtain their text embeddings. The generated embeddings are sent to clients to estimate the distribution of each category in the semantic space. Regarding these distributions as the local classifiers, we align the image representations with the corresponding semantic distributions by minimizing an upper bound of the expected cross-entropy loss. Extensive experiments on public datasets demonstrate the superior performance of FedCB compared to state-of-the-art methods. The source code is available at https://github.com/CUHK-AIM-Group/FedCB.
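The phrase "minimizing an upper bound of the expected cross-entropy loss" is not spelled out in this record. As a hedged sketch: if each class c is modeled as a Gaussian in the PLM semantic space (an assumption; the paper's exact formulation may differ), then for an image feature f with label y, Jensen's inequality together with the Gaussian moment-generating function yields a closed-form surrogate,

\[
\mathbb{E}_{w_c \sim \mathcal{N}(\mu_c, \Sigma_c)}\!\left[-\log \frac{e^{w_y^{\top} f}}{\sum_{c} e^{w_c^{\top} f}}\right]
\;\le\;
-\mu_y^{\top} f + \log \sum_{c} \exp\!\left(\mu_c^{\top} f + \tfrac{1}{2}\, f^{\top} \Sigma_c f\right),
\]

so the right-hand side can be minimized directly instead of sampling classifier weights from the per-class distributions.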
Pages: 685-695
Number of pages: 11
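For a concrete picture of the frozen, PLM-derived classifier that the abstract describes, below is a minimal sketch, not the authors' implementation (their released code is at the GitHub link above). It assumes PyTorch and Hugging Face transformers, uses bert-base-uncased as a stand-in for the pre-trained language model, and treats the prompt template, the temperature, and the image encoder as illustrative placeholders.

```python
# Minimal sketch of the idea in the abstract, NOT the released FedCB code.
# Assumptions: PyTorch + Hugging Face `transformers`; `bert-base-uncased` stands in
# for the pre-trained language model; prompt template and temperature are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel


def build_text_classifier(class_names, plm_name="bert-base-uncased"):
    """Encode prompted class descriptions with a frozen PLM and return
    L2-normalised class embeddings used as fixed classifier weights."""
    tokenizer = AutoTokenizer.from_pretrained(plm_name)
    plm = AutoModel.from_pretrained(plm_name).eval()
    prompts = [f"a medical image of {name}" for name in class_names]  # hypothetical template
    with torch.no_grad():
        tokens = tokenizer(prompts, padding=True, return_tensors="pt")
        hidden = plm(**tokens).last_hidden_state               # (C, T, D)
        mask = tokens["attention_mask"].unsqueeze(-1).float()  # (C, T, 1)
        class_emb = (hidden * mask).sum(1) / mask.sum(1)       # mean-pool over tokens
    return F.normalize(class_emb, dim=-1)                      # (C, D)


class FrozenClassifierNet(nn.Module):
    """Local model: trainable image encoder + frozen, PLM-derived classifier.
    The encoder must output features with the same dimensionality as the
    class embeddings (768 for bert-base-uncased)."""

    def __init__(self, image_encoder, class_emb):
        super().__init__()
        self.encoder = image_encoder                   # maps images -> (B, D) features
        self.register_buffer("class_emb", class_emb)   # buffer => never updated by SGD

    def forward(self, images):
        feats = F.normalize(self.encoder(images), dim=-1)
        return feats @ self.class_emb.t()              # cosine-similarity logits (B, C)


def local_step(model, optimizer, images, labels, temperature=0.07):
    """One local training step: only the image encoder receives gradients,
    so the classifier cannot drift toward the client's skewed label distribution."""
    logits = model(images) / temperature
    loss = F.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Registering `class_emb` as a buffer rather than a parameter is what keeps the classifier fixed during local training, mirroring the observation in the abstract that freezing local classifiers already improves FedAvg under heterogeneous data; in a federated setting, only the encoder weights would then be aggregated across clients.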
Related papers (50 records in total)
  • [21] On the Sentence Embeddings from Pre-trained Language Models
    Li, Bohan; Zhou, Hao; He, Junxian; Wang, Mingxuan; Yang, Yiming; Li, Lei
    Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 9119-9130
  • [22] Pre-trained language models with domain knowledge for biomedical extractive summarization
    Xie, Q.; Bishop, J. A.; Tiwari, P.; Ananiadou, S.
    Knowledge-Based Systems, 2022, 252
  • [23] Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey
    Bhargava, Prajjwal; Ng, Vincent
    Thirty-Sixth AAAI Conference on Artificial Intelligence / Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence / Twelfth Symposium on Educational Advances in Artificial Intelligence, 2022: 12317-12325
  • [24] Plug-and-Play Knowledge Injection for Pre-trained Language Models
    Zhang, Zhengyan; Zeng, Zhiyuan; Lin, Yankai; Wang, Huadong; Ye, Deming; Xiao, Chaojun; Han, Xu; Liu, Zhiyuan; Li, Peng; Sun, Maosong; Zhou, Jie
    Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Long Papers, Vol. 1, 2023: 10641-10656
  • [25] Enhancing pre-trained language models with Chinese character morphological knowledge
    Zheng, Zhenzhong; Wu, Xiaoming; Liu, Xiangzhi
    Information Processing & Management, 2025, 62 (01)
  • [26] Gauging, enriching and applying geography knowledge in Pre-trained Language Models
    Ramrakhiyani, Nitin; Varma, Vasudeva; Palshikar, Girish Keshav; Pawar, Sachin
    Information Processing & Management, 2025, 62 (01)
  • [27] From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Models to Pre-trained Machine Reader
    Xu, Weiwen; Li, Xin; Zhang, Wenxuan; Zhou, Meng; Lam, Wai; Si, Luo; Bing, Lidong
    Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023
  • [28] Knowledge Base Grounded Pre-trained Language Models via Distillation
    Sourty, Raphael; Moreno, Jose G.; Servant, Francois-Paul; Tamine, Lynda
    39th Annual ACM Symposium on Applied Computing (SAC 2024), 2024: 1617-1625
  • [29] Federated Learning from Pre-Trained Models: A Contrastive Learning Approach
    Tan, Yue; Long, Guodong; Ma, Jie; Liu, Lu; Zhou, Tianyi; Jiang, Jing
    Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
  • [30] Knowledge-Grounded Dialogue Generation with Pre-trained Language Models
    Zhao, Xueliang; Wu, Wei; Xu, Can; Tao, Chongyang; Zhao, Dongyan; Yan, Rui
    Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 3377-3390