A data-driven approach to choosing privacy parameters for clinical trial data sharing under differential privacy

Authors
Chen, Henian [1, 5]
Pang, Jinyong [1 ]
Zhao, Yayi [1 ]
Giddens, Spencer [2 ]
Ficek, Joseph [3 ]
Valente, Matthew J. [1 ]
Cao, Biwei [1 ]
Daley, Ellen [4 ]
Affiliations
[1] Univ S Florida, Coll Publ Hlth, Study Design & Data Anal, Tampa, FL 33612 USA
[2] Univ Notre Dame, Dept Appl & Computat Math & Stat, Notre Dame, IN 46556 USA
[3] GlaxoSmithKline, Oncol Stat, Collegeville, PA 19426 USA
[4] Univ S Florida, Coll Publ Hlth, Lawton & Rhea Chiles Ctr Children & Families, Tampa, FL USA
[5] Univ S Florida, Coll Publ Hlth, Study Design & Data Anal, 13201 Bruce B Downs Blvd, MDC 56, Tampa, FL 33612 USA
Keywords
clinical trial; differential privacy; accuracy; data sharing; privacy parameter; relation extraction
DOI
10.1093/jamia/ocae038
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objectives: Clinical trial data sharing is crucial for promoting transparency and collaboration in medical research. Differential privacy (DP) is a formal statistical technique for anonymizing shared data that balances the privacy of individual records against the accuracy of replicated results through a "privacy budget" parameter, epsilon. DP is considered the state of the art in privacy-protected data publication, yet it is underutilized in clinical trial data sharing. This study aims to identify suitable epsilon values for sharing clinical trial data.
Materials and Methods: We analyzed 2 clinical trial datasets with the privacy budget epsilon ranging from 0.01 to 10. Smaller values of epsilon entail adding greater amounts of random noise, which yields better privacy. We compared rates, odds ratios, means, and mean differences from the original clinical trial datasets with the empirical distribution of the corresponding DP estimators.
Results: The DP rate closely approximated the original rate of 6.5% when epsilon > 1. The DP odds ratio closely aligned with the original odds ratio of 0.689 when epsilon >= 3. The DP mean closely approximated the original mean of 164.64 when epsilon >= 1. As epsilon increased to 5, both the minimum and maximum DP means converged toward the original mean.
Discussion: There is no consensus on how to choose the privacy budget epsilon. The definition of DP does not specify the required level of privacy, and there is no established formula for determining epsilon.
Conclusion: Our findings suggest that DP holds promise for sharing clinical trial data.
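The trade-off described in the abstract (smaller epsilon, more noise, less accurate estimates) can be illustrated with a minimal sketch. The abstract does not state which DP mechanism the authors used, so the Laplace mechanism applied to an event count is an assumption here, as are the function names `dp_rate` and `simulate` and the hypothetical sample of 65 events among 1000 participants, chosen only to reproduce the reported 6.5% rate.

```python
import numpy as np


def dp_rate(events, n, epsilon, rng):
    """Return one DP estimate of the rate events/n.

    Assumption: Laplace mechanism on the event count. Adding or removing
    one participant changes the count by at most 1, so the sensitivity
    is 1 and the noise scale is 1/epsilon.
    """
    noisy_count = events + rng.laplace(loc=0.0, scale=1.0 / epsilon)
    # Clipping is post-processing and does not weaken the DP guarantee.
    return float(np.clip(noisy_count / n, 0.0, 1.0))


def simulate(events, n, epsilons, reps=1000, seed=0):
    """Summarize the empirical distribution of the DP rate per epsilon.

    Returns {epsilon: (minimum, median, maximum)} over `reps` draws.
    """
    rng = np.random.default_rng(seed)
    summary = {}
    for eps in epsilons:
        draws = np.array([dp_rate(events, n, eps, rng) for _ in range(reps)])
        summary[eps] = (
            float(draws.min()),
            float(np.median(draws)),
            float(draws.max()),
        )
    return summary


if __name__ == "__main__":
    # Hypothetical data: 65 events among 1000 participants (rate 6.5%).
    results = simulate(65, 1000, [0.01, 0.1, 1.0, 3.0, 5.0, 10.0])
    for eps, (lo, med, hi) in results.items():
        print(f"epsilon={eps:<5} min={lo:.4f} median={med:.4f} max={hi:.4f}")
```

Running the script prints the minimum, median, and maximum DP rate for each epsilon. Under these assumptions the range should narrow around the true rate as epsilon grows, which is the kind of comparison the study uses to judge which epsilon values preserve accuracy.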
Pages: 1135-1143
Number of pages: 9