The estimation and use of predictions for the assessment of model performance using large samples with multiply imputed data

Cited by: 54

Authors:

Wood, Angela M. ^{[1]}

Royston, Patrick ^{[2]}

White, Ian R. ^{[3]}

Affiliations:

[1] Univ Cambridge, Dept Publ Hlth & Primary Care, Strangeways Res Lab, Cambridge CB1 8RN, England

[2] UCL, MRC, Clin Trials Unit, London WC2B 6NH, England

[3] Cambridge Inst Publ Hlth, MRC, Biostat Unit, Cambridge CB2 0SR, England

Source:

BIOMETRICAL JOURNAL | 2015 / Vol. 57 / Issue 04

Keywords:

Measures of model performance; Missing data; Model validation; Multiple imputation; Prediction models; Rubin's rules; DISEASE RISK SCORE; EXTERNAL VALIDATION; PROGNOSTIC MODELS; MISSING-DATA; IMPUTATION; CANCER; VALUES; QRISK;

DOI:

10.1002/bimj.201400004

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

Multiple imputation can be used as a tool in the process of constructing prediction models in medical and epidemiological studies with missing covariate values. Such models can be used to make predictions for model performance assessment, but the task is made more complicated by the multiple imputation structure. We summarize various predictions constructed from covariates, including multiply imputed covariates, and either the set of imputation-specific prediction model coefficients or the pooled prediction model coefficients. We further describe approaches for using the predictions to assess model performance. We distinguish between ideal model performance and pragmatic model performance, where the former refers to the model's performance in an ideal clinical setting where all individuals have fully observed predictors and the latter refers to the model's performance in a real-world clinical setting where some individuals have missing predictors. The approaches are compared through an extensive simulation study based on the UK700 trial. We determine that measures of ideal model performance can be estimated within imputed datasets and subsequently pooled to give an overall measure of model performance. Alternative methods to evaluate pragmatic model performance are required and we propose constructing predictions either from a second set of covariate imputations which make no use of observed outcomes, or from a set of partial prediction models constructed for each potential observed pattern of covariate. Pragmatic model performance is generally lower than ideal model performance. We focus on model performance within the derivation data, but describe how to extend all the methods to a validation dataset.
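
The abstract states that measures of ideal model performance can be estimated within each imputed dataset and then pooled. A minimal sketch of that pooling step using Rubin's rules (listed among the keywords) follows. The function name `pool_rubin` and its inputs are hypothetical and not taken from the paper. The sketch assumes each imputed dataset has already yielded a performance estimate and its within-imputation variance. The abstract does not say which measures are pooled or on what scale, so any transformation applied before pooling is left to the reader.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: one performance estimate per imputed dataset (hypothetical input)
    variances: the matching within-imputation variances (hypothetical input)
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching variances")

    pooled = mean(estimates)        # average of the M estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance, M - 1 denominator
    total = within + (1 + 1 / m) * between
    return pooled, total
```

For example, `pool_rubin([0.70, 0.72, 0.74], [0.0004, 0.0004, 0.0004])` pools three hypothetical estimates of a performance measure into a single estimate with a total variance that reflects both sampling and imputation uncertainty.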

Pages: 614 - 632

Number of pages: 19
