Stealing Knowledge from Pre-trained Language Models for Federated Classifier Debiasing

Cited by: 1
Authors
Zhu, Meilu [1 ]
Yang, Qiushi [2 ]
Gao, Zhifan [3 ]
Liu, Jun [1 ]
Yuan, Yixuan [4 ]
Affiliations
[1] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[3] Sun Yat Sen Univ, Sch Biomed Engn, Guangzhou, Peoples R China
[4] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
Keywords
Federated learning; Medical Image Classification; Pre-trained Language Model;
DOI
10.1007/978-3-031-72117-5_64
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Federated learning (FL) has shown great potential in medical image computing since it provides a decentralized learning paradigm that allows multiple clients to train a model collaboratively without privacy leakage. However, current studies have shown that the heterogeneous data of clients causes biased classifiers in local models during training, leading to performance degradation of the federated system. In experiments, we surprisingly found that continuously freezing the local classifiers can significantly improve the performance of the baseline FL method (FedAvg) on heterogeneous data. This observation motivates us to pre-construct a high-quality initial classifier for local models and freeze it during local training to avoid classifier bias. With this insight, we propose a novel approach named Federated Classifier deBiasing (FedCB) to solve the classifier bias problem in heterogeneous federated learning. The core idea behind FedCB is to exploit linguistic knowledge from pre-trained language models (PLMs) to construct high-quality local classifiers. Specifically, FedCB first collects the class concepts from clients and then uses a set of prompts to contextualize them, yielding language descriptions of these concepts. These descriptions are fed into a pre-trained language model to obtain their text embeddings. The generated embeddings are sent to clients to estimate the distribution of each category in the semantic space. Regarding these distributions as the local classifiers, we align the image representations with the corresponding semantic distributions by minimizing an upper bound of the expected cross-entropy loss. Extensive experiments on public datasets demonstrate the superior performance of FedCB compared to state-of-the-art methods. The source code is available at https://github.com/CUHK-AIM-Group/FedCB.
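The pipeline described in the abstract (prompting class concepts, encoding them with a pre-trained language model, and aligning image features to the resulting frozen per-class embeddings) can be illustrated roughly as below. This is a minimal, hypothetical sketch, not the released FedCB code: the prompt templates, the example class names, the `bert-base-uncased` encoder, the masked mean pooling, and the plain cosine-similarity cross-entropy are all assumptions, with the latter standing in for the paper's actual objective (an upper bound of the expected cross-entropy over each class's estimated semantic distribution).

```python
# Hypothetical sketch of a PLM-derived frozen classifier and an alignment loss.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

# Assumed prompt templates and class concepts; the real FedCB prompt set may differ.
PROMPTS = ["a photo of {}.", "a medical image of {}."]
CLASSES = ["melanoma", "nevus", "basal cell carcinoma"]

@torch.no_grad()
def build_text_classifier(plm_name: str = "bert-base-uncased") -> torch.Tensor:
    """Encode prompted class descriptions with a frozen PLM and return a
    (num_classes, hidden_dim) matrix used as a fixed classifier head."""
    tok = AutoTokenizer.from_pretrained(plm_name)
    plm = AutoModel.from_pretrained(plm_name).eval()
    rows = []
    for concept in CLASSES:
        texts = [p.format(concept) for p in PROMPTS]
        batch = tok(texts, return_tensors="pt", padding=True)
        hidden = plm(**batch).last_hidden_state        # (num_prompts, tokens, dim)
        mask = batch["attention_mask"].unsqueeze(-1)   # (num_prompts, tokens, 1)
        emb = (hidden * mask).sum(1) / mask.sum(1)     # masked mean over tokens
        rows.append(F.normalize(emb.mean(0), dim=-1))  # average over prompts
    return torch.stack(rows)                           # kept frozen during local training

def alignment_loss(image_feats: torch.Tensor,
                   labels: torch.Tensor,
                   text_weights: torch.Tensor,
                   tau: float = 0.07) -> torch.Tensor:
    """Cosine-similarity cross-entropy between image features (projected to the PLM
    dimension) and the frozen text classifier; a simplified stand-in for the paper's
    upper bound of the expected cross-entropy loss."""
    logits = F.normalize(image_feats, dim=-1) @ text_weights.t() / tau
    return F.cross_entropy(logits, labels)
```

In such a setup only the image encoder (and its projection to the PLM embedding dimension) would receive gradients during local training, while `text_weights` stays frozen, mirroring the observation that freezing local classifiers mitigates classifier bias under heterogeneous data.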
Pages: 685-695
Number of pages: 11
Related Papers
50 records in total
  • [31] Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference
    Zheng, Junhao
    Ma, Qianli
    Qiu, Shengjie
    Wu, Yue
    Ma, Peitian
    Liu, Junlong
    Feng, Huawen
    Shang, Xichen
    Chen, Haibin
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 9155 - 9173
  • [32] Annotating Columns with Pre-trained Language Models
    Suhara, Yoshihiko
    Li, Jinfeng
    Li, Yuliang
    Zhang, Dan
    Demiralp, Cagatay
    Chen, Chen
    Tan, Wang-Chiew
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022, : 1493 - 1503
  • [33] Distilling Relation Embeddings from Pre-trained Language Models
    Ushio, Asahi
    Camacho-Collados, Jose
    Schockaert, Steven
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9044 - 9062
  • [34] LaoPLM: Pre-trained Language Models for Lao
    Lin, Nankai
    Fu, Yingwen
    Yang, Ziyu
    Chen, Chuwei
    Jiang, Shengyi
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6506 - 6512
  • [35] PhoBERT: Pre-trained language models for Vietnamese
    Dat Quoc Nguyen
    Anh Tuan Nguyen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 1037 - 1042
  • [36] Deciphering Stereotypes in Pre-Trained Language Models
    Ma, Weicheng
    Scheible, Henry
    Wang, Brian
    Veeramachaneni, Goutham
    Chowdhary, Pratim
    Sung, Alan
    Koulogeorge, Andrew
    Wang, Lili
    Yang, Diyi
    Vosoughi, Soroush
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 11328 - 11345
  • [37] HinPLMs: Pre-trained Language Models for Hindi
    Huang, Xixuan
    Lin, Nankai
    Li, Kexin
    Wang, Lianxi
    Gan, Suifu
    2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 241 - 246
  • [38] Evaluating Commonsense in Pre-Trained Language Models
    Zhou, Xuhui
    Zhang, Yue
    Cui, Leyang
    Huang, Dandan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9733 - 9740
  • [39] Distilling Word Meaning in Context from Pre-trained Language Models
    Arase, Yuki
    Kajiwara, Tomoyuki
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 534 - 546
  • [40] Code Execution with Pre-trained Language Models
    Liu, Chenxiao
    Lu, Shuai
    Chen, Weizhu
    Jiang, Daxin
    Svyatkovskiy, Alexey
    Fu, Shengyu
    Sundaresan, Neel
    Duan, Nan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 4984 - 4999