Analysis of Privacy Leakage in Federated Large Language Models

Cited: 0
Authors
Vu, Minh N. [1 ]
Nguyen, Truc [1 ]
Jeter, Tre' R. [1 ]
Thai, My T. [1 ]
Affiliations
[1] Univ Florida, Gainesville, FL 32611 USA
Funding
US National Science Foundation;
Keywords
MEMBERSHIP INFERENCE ATTACKS;
DOI
None
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need for significant modifications to FL to accommodate the large scale of LLMs. While substantial adjustments to the protocol have been introduced in response, a comprehensive privacy analysis of the adapted FL protocols is currently lacking. To address this gap, our work conducts an extensive examination of the privacy of FL when used for training LLMs, from both theoretical and practical perspectives. In particular, we design two active membership inference attacks with guaranteed theoretical success rates to assess the privacy leakage of various adapted FL configurations. Our theoretical findings translate into practical attacks, revealing substantial privacy vulnerabilities in popular LLMs, including BERT, RoBERTa, DistilBERT, and OpenAI's GPTs, across multiple real-world language datasets. Additionally, we conduct thorough experiments to evaluate the privacy leakage of these models when data is protected by state-of-the-art differential privacy (DP) mechanisms.
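To illustrate the general idea behind membership inference (the attack class the abstract evaluates), the following is a minimal sketch of a passive loss-thresholding baseline: training members tend to incur lower loss than unseen examples, so a simple threshold on per-example loss already separates them. This is a generic illustrative baseline on synthetic loss values, not the paper's active attacks; the loss distributions and threshold are assumptions for demonstration.

```python
import random

def loss_threshold_attack(losses, threshold):
    """Guess 'member' for any example whose loss falls below the
    threshold (a generic passive baseline, not the paper's method)."""
    return [loss < threshold for loss in losses]

random.seed(0)
# Hypothetical per-example losses: members typically fit better.
member_losses = [random.gauss(0.5, 0.2) for _ in range(1000)]
nonmember_losses = [random.gauss(1.5, 0.4) for _ in range(1000)]

preds_members = loss_threshold_attack(member_losses, threshold=1.0)
preds_nonmembers = loss_threshold_attack(nonmember_losses, threshold=1.0)

# Attack accuracy: members correctly flagged plus non-members cleared.
accuracy = (sum(preds_members)
            + (len(preds_nonmembers) - sum(preds_nonmembers))) / 2000
print(f"attack accuracy: {accuracy:.2f}")
```

Under these synthetic distributions the attack is well above chance (0.5), which is the kind of gap that DP mechanisms are designed to narrow by bounding each example's influence on the trained model.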
Pages: 23