Graphical and numerical diagnostic tools to assess suitability of multiple imputations and imputation models

Cited by: 32
Authors
Bondarenko, Irina [1]
Raghunathan, Trivellore [2, 3]
Affiliations
[1] Univ Michigan, Sch Publ Hlth, Dept Biostat, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Inst Social Res, Survey Res Ctr, Ann Arbor, MI 48106 USA
[3] Univ Michigan, Sch Publ Hlth, Dept Biostat, Ann Arbor, MI 48106 USA
Keywords
multiple imputation; propensity score; diagnostics; congeniality; bias
DOI
10.1002/sim.6926
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Multiple imputation has become a popular approach for analyzing incomplete data. Many software packages are available to multiply impute the missing values and to analyze the resulting completed data sets. However, diagnostic tools to check the validity of the imputations are limited, and the majority of the currently available methods require considerable knowledge of the imputation model. In many practical settings, however, the imputer and the analyst may be different individuals or from different organizations, and the analyst's model may or may not be congenial to the model used by the imputer. This article develops and evaluates a set of graphical and numerical diagnostic tools for two practical purposes: (i) for an analyst to determine whether the imputations are reasonable under his/her model assumptions without actually knowing the imputation model assumptions; and (ii) for an imputer to fine-tune the imputation model by checking the key characteristics of the observed and imputed values. The tools are based on numerical and graphical comparisons of the distributions of the observed and imputed values conditional on the propensity of response. The methodology is illustrated using simulated data sets created under a variety of scenarios. The examples focus on continuous and binary variables, but the principles can be used to extend the methods to other types of variables. Copyright (c) 2016 John Wiley & Sons, Ltd.
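The core idea in the abstract, comparing observed and imputed values conditional on the propensity of response, can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' procedure: the single covariate `x`, the simulated data, the use of five propensity strata, and the helper names `fit_logistic` and `stratum_differences` are all hypothetical choices made for this example. The "good" imputation draws from the true data-generating model (an oracle that would not be available in practice), while the "bad" one fills in the marginal observed mean.

```python
import math
import random
from statistics import mean


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, r, iters=25):
    """Newton-Raphson fit of P(r = 1 | x) = expit(a + b * x), one covariate."""
    a = b = 0.0
    for _ in range(iters):
        p = [expit(a + b * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(ri - pi for ri, pi in zip(r, p))
        g1 = sum((ri - pi) * xi for ri, pi, xi in zip(r, p, x))
        # Information matrix entries (X' W X with W = p * (1 - p)).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def stratum_differences(x, y_completed, r, n_strata=5):
    """Within equal-size strata of the estimated response propensity,
    return (stratum, n_observed, n_imputed, mean observed - mean imputed)."""
    a, b = fit_logistic(x, r)
    score = [expit(a + b * xi) for xi in x]
    order = sorted(range(len(x)), key=lambda i: score[i])
    rows = []
    for s in range(n_strata):
        idx = order[s * len(x) // n_strata:(s + 1) * len(x) // n_strata]
        obs = [y_completed[i] for i in idx if r[i] == 1]
        imp = [y_completed[i] for i in idx if r[i] == 0]
        if obs and imp:  # skip strata lacking either group
            rows.append((s + 1, len(obs), len(imp), mean(obs) - mean(imp)))
    return rows


# Simulated data (assumption): y depends on x, and response depends on x only.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [2 + xi + rng.gauss(0, 1) for xi in x]
r = [1 if rng.random() < expit(0.5 + xi) else 0 for xi in x]

# Two completed data sets: one imputed conditional on x (oracle model),
# one imputed with the marginal mean of the observed values.
obs_mean = mean(yi for yi, ri in zip(y, r) if ri == 1)
y_good = [yi if ri else 2 + xi + rng.gauss(0, 1) for xi, yi, ri in zip(x, y, r)]
y_bad = [yi if ri else obs_mean for yi, ri in zip(y, r)]

good_rows = stratum_differences(x, y_good, r)
bad_rows = stratum_differences(x, y_bad, r)
for label, rows in (("conditional on x", good_rows), ("marginal mean", bad_rows)):
    print(label)
    for s, n_obs, n_imp, diff in rows:
        print(f"  stratum {s}: n_obs={n_obs:4d} n_imp={n_imp:4d} diff={diff:+.2f}")
```

In this sketch, within-stratum differences that stay close to zero suggest the imputed values resemble the observed ones among units with similar response propensity, whereas large differences that change sign across strata point to an imputation model that ignores a predictor of both response and outcome. Only one completed data set per method is examined here; with multiple imputations the same comparison would be repeated for each completed data set.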
Pages: 3007-3020
Number of pages: 14