Vertical federated learning based on data subset representation for healthcare application

Authors
Shi, Yukun [1]
Zhang, Jilin [1]
Xue, Meiting [1]
Zeng, Yan [2]
Jia, Gangyong [2]
Yu, Qihong [2]
Li, Miaoqi [2]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Cyberspace, Hangzhou 310018, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Vertical federated learning; Latent feature representation; Smart healthcare; Privacy preservation
DOI
10.1016/j.cmpb.2025.108623
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Background and Objective: Artificial intelligence is increasingly essential for disease classification and clinical diagnosis in healthcare. Because healthcare data are subject to strict privacy requirements, Vertical Federated Learning (VFL) has been introduced. VFL allows multiple hospitals to collaboratively train models on vertically partitioned data, where each hospital holds only part of each patient's features, thus maintaining patient confidentiality. However, applying VFL in healthcare scenarios with few samples and labels is challenging, because existing methods depend heavily on labeled samples and do not consider the intrinsic connections among the data held by different hospitals. Methods: This paper proposes FedRL, a representation-based VFL method that improves the performance of downstream tasks by using aligned data for federated representation pretraining. The proposed method splits the local data into subsets with the same feature dimension, exploits the relationships among these subsets, constructs a bespoke loss function, and collaboratively trains a representation model on these subsets across all participating hospitals. This model captures latent representations of the global data, which are then applied to the downstream classification tasks. Results and Conclusion: FedRL was validated through experiments on three healthcare datasets. The results show that it outperforms several existing methods across three performance metrics. Specifically, FedRL achieves average improvements of 4.7%, 5.6%, and 4.8% in accuracy, AUC, and F1-score, respectively, compared to current methods. In addition, FedRL is more robust and performs consistently in scenarios with limited labeled samples, confirming its effectiveness and potential use in healthcare data analysis.
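To make the data layout in the abstract concrete, the following is a minimal sketch of vertical partitioning followed by splitting each hospital's local features into subsets of the same width. It is an illustration only, not the FedRL implementation: the function names `vertical_partition` and `split_into_subsets`, the toy data, and the zero-padding of the last subset are all assumptions, since the abstract does not say how the subsets are formed.

```python
# Illustrative sketch only; names and the zero-padding rule are assumptions,
# not details taken from the FedRL paper.

def vertical_partition(rows, feature_splits):
    """Split every sample's feature vector column-wise among parties.

    rows: list of samples, each a list of feature values (aligned patients).
    feature_splits: number of features held by each party, in order.
    """
    if sum(feature_splits) != len(rows[0]):
        raise ValueError("feature_splits must cover every feature exactly once")
    parties, start = [], 0
    for width in feature_splits:
        parties.append([row[start:start + width] for row in rows])
        start += width
    return parties


def split_into_subsets(local_rows, subset_dim):
    """Split one party's local features into subsets of identical width.

    The last subset is zero-padded when the feature count is not a multiple
    of subset_dim (an assumption made for this sketch).
    """
    n_features = len(local_rows[0])
    n_subsets = -(-n_features // subset_dim)  # ceiling division
    pad_len = n_subsets * subset_dim - n_features
    subsets = []
    for k in range(n_subsets):
        lo = k * subset_dim
        subsets.append(
            [(list(row) + [0.0] * pad_len)[lo:lo + subset_dim] for row in local_rows]
        )
    return subsets


# Toy data: 4 aligned patients with 7 features in total.
rows = [[float(10 * i + j) for j in range(7)] for i in range(4)]

# Hospital A holds the first 4 features, hospital B the remaining 3.
hospital_a, hospital_b = vertical_partition(rows, [4, 3])

# Each hospital splits its local features into width-2 subsets.
subsets_a = split_into_subsets(hospital_a, 2)
subsets_b = split_into_subsets(hospital_b, 2)
```

Because every subset has the same width, a single representation model with one input dimension could in principle be trained on subsets from all hospitals; how FedRL actually relates the subsets and defines its loss is specified in the paper, not here.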
Pages: 11