Federated Machine Learning, Privacy-Enhancing Technologies, and Data Protection Laws in Medical Research: Scoping Review

Cited by: 22
Authors
Brauneck, Alissa [1]
Schmalhorst, Louisa [1]
Majdabadi, Mohammad Mahdi Kazemi [2]
Bakhtiari, Mohammad [2]
Voelker, Uwe [3]
Baumbach, Jan [2, 4]
Baumbach, Linda [5]
Buchholtz, Gabriele [1]
Affiliations
[1] Univ Hamburg, Hamburg Univ Fac Law, Hamburg, Germany
[2] Univ Hamburg, Inst Computat Syst Biol, Hamburg, Germany
[3] Univ Med Greifswald, Interfac Inst Genet & Funct Genom, Dept Funct Genom, Greifswald, Germany
[4] Univ Southern Denmark, Computat Biomed Lab, Odense, Denmark
[5] Univ Med Ctr Hamburg Eppendorf, Dept Hlth Econ & Hlth Serv Res, Hamburg, Germany
Keywords
federated learning; data protection regulation; data protection by design; privacy protection; General Data Protection Regulation compliance; GDPR compliance; privacy-preserving technologies; differential privacy; secure multiparty computation
DOI
10.2196/41588
Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Abstract
Background: The collection, storage, and analysis of large data sets are relevant in many sectors. Especially in the medical field, the processing of patient data promises great progress in personalized health care. However, such processing is strictly regulated, for example by the General Data Protection Regulation (GDPR). These regulations mandate strict data security and data protection and, thus, create major challenges for collecting and using large data sets. Technologies such as federated learning (FL), especially paired with differential privacy (DP) and secure multiparty computation (SMPC), aim to solve these challenges.
Objective: This scoping review aimed to summarize the current discussion on the legal questions and concerns related to FL systems in medical research. We were particularly interested in whether and to what extent FL applications and training processes are compliant with the GDPR data protection law and whether the use of the aforementioned privacy-enhancing technologies (DP and SMPC) affects this legal compliance. We placed special emphasis on the consequences for medical research and development.
Methods: We performed a scoping review according to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews). We reviewed articles on Beck-Online, SSRN, ScienceDirect, arXiv, and Google Scholar published in German or English between 2016 and 2022. We examined 4 questions: whether local and global models are "personal data" as per the GDPR; what the "roles" as defined by the GDPR of various parties in FL are; who controls the data at various stages of the training process; and how, if at all, the use of privacy-enhancing technologies affects these findings.
Results: We identified and summarized the findings of 56 relevant publications on FL. Local and likely also global models constitute personal data according to the GDPR. FL strengthens data protection but is still vulnerable to a number of attacks and the possibility of data leakage. These concerns can be successfully addressed through the privacy-enhancing technologies SMPC and DP.
Conclusions: Combining FL with SMPC and DP is necessary to fulfill the legal data protection requirements (GDPR) in medical research dealing with personal data. Even though some technical and legal challenges remain, for example, the possibility of successful attacks on the system, combining FL with SMPC and DP creates enough security to satisfy the legal requirements of the GDPR. This combination thereby provides an attractive technical solution for health institutions willing to collaborate without exposing their data to risk. From a legal perspective, the combination provides enough built-in security measures to satisfy data protection requirements, and from a technical perspective, the combination provides secure systems with performance comparable to that of centralized machine learning applications.
Pages: 17