A Novel Imputation Approach for Sharing Protected Public Health Data

Cited by: 5
Authors
Erdman, Elizabeth A. [1]
Young, Leonard D. [2]
Bernson, Dana L. [1]
Bauer, Cici [4]
Chui, Kenneth [3]
Stopka, Thomas J. [5, 6]
Affiliations
[1] Commonwealth Massachusetts, Off Populat Hlth, Dept Publ Hlth, Boston, MA USA
[2] Commonwealth Massachusetts, Bur Hlth Profess Licensure, Dept Publ Hlth, Boston, MA USA
[3] Tufts Univ, Dept Publ Hlth & Community Med, Boston, MA USA
[4] Univ Texas Hlth Sci Ctr Houston, Dept Biostat & Data Sci, Houston, TX USA
[5] Tufts Univ, Tufts Clin & Translat Sci Inst, Medford, MA USA
[6] Tufts Univ, Dept Publ Hlth & Community Med, Medford, MA USA
Keywords
MISSING DATA
DOI
10.2105/AJPH.2021.306432
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene]
Discipline classification codes
1004; 120402
Abstract
Objectives. To develop an imputation method to produce estimates for suppressed values within a shared government administrative data set to facilitate accurate data sharing and statistical and spatial analyses. Methods. We developed an imputation approach that incorporated known features of suppressed Massachusetts surveillance data from 2011 to 2017 to predict missing values more precisely. Our methods for 35 de-identified opioid prescription data sets combined modified previous or next substitution followed by mean imputation and a count adjustment to estimate suppressed values before sharing. We modeled 4 methods and compared the results to baseline mean imputation. Results. We assessed performance by comparing root mean squared error (RMSE), mean absolute error (MAE), and proportional variance between imputed and suppressed values. Our method outperformed mean imputation; we retained 46% of the suppressed value's proportional variance with better precision (22% lower RMSE and 26% lower MAE) than simple mean imputation. Conclusions. Our easy-to-implement imputation technique largely overcomes the adverse effects of low count value suppression with superior results to simple mean imputation. This novel method is generalizable to researchers sharing protected public health surveillance data.
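The abstract names the ingredients of the approach (previous or next substitution, mean imputation as a fallback, a count adjustment, and evaluation by RMSE and MAE) but not their details. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes that suppressed cells are marked `None` in a time-ordered series, that "previous or next" means the immediately adjacent period, and that the count adjustment is a simple clip into a known suppression range. The function names and the 1 to 10 range are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def impute_suppressed(
    counts: Sequence[Optional[float]], low: float = 1.0, high: float = 10.0
) -> List[float]:
    """Fill suppressed (None) cells in a time-ordered series of counts.

    Assumed rule: use the adjacent previous value, else the adjacent next
    value, else the mean of all observed values (or the range midpoint if
    nothing is observed). Imputed values are then clipped to [low, high],
    the assumed range of counts that triggers suppression.
    """
    observed = [float(c) for c in counts if c is not None]
    fallback = sum(observed) / len(observed) if observed else (low + high) / 2.0

    filled: List[float] = []
    for i, c in enumerate(counts):
        if c is not None:
            filled.append(float(c))  # observed values pass through unchanged
            continue
        prev = counts[i - 1] if i > 0 else None
        nxt = counts[i + 1] if i + 1 < len(counts) else None
        if prev is not None:
            guess = float(prev)
        elif nxt is not None:
            guess = float(nxt)
        else:
            guess = fallback
        # Count adjustment (assumed form): keep the estimate in the known range.
        filled.append(min(max(guess, low), high))
    return filled


def rmse(truth: Sequence[float], estimate: Sequence[float]) -> float:
    """Root mean squared error between true and imputed values."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


def mae(truth: Sequence[float], estimate: Sequence[float]) -> float:
    """Mean absolute error between true and imputed values."""
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)
```

For a data holder who knows the true suppressed values, comparing `rmse` and `mae` for this rule against plain mean imputation on the same cells mirrors the kind of comparison the abstract reports.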
Pages: 1830-1838
Number of pages: 9