Internal-external cross-validation helped to evaluate the generalizability of prediction models in large clustered datasets

Cited by: 28

Authors:

Takada, Toshihiko [1]

Nijman, Steven [1]

Denaxas, Spiros [2,3,4,5]

Snell, Kym I. E. [6]

Uijl, Alicia [1,7,8]

Nguyen, Tri-Long [1,9]

Asselbergs, Folkert W. [2,10]

Debray, Thomas P. A. [1,2]

Affiliations:

[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Univ Weg 100, NL-3584 CG Utrecht, Netherlands

[2] UCL, Hlth Data Res UK & Inst Hlth Informat, Gibbs Bldg,215 Euston Rd, London NW1 2BE, England

[3] Alan Turing Inst, British Lib, 96 Euston Rd, London NW1 2DB, England

[4] UCL, Univ Coll London Hosp, Biomed Res Ctr, Natl Inst Hlth Res, Suite A,1st Floor,Maple House, London W1T 7DN, England

[5] UCL, British Heart Fdn Res Accelerator, Gower St, London WC1E 6BT, England

[6] Keele Univ, Sch Med, Ctr Prognosis Res, Keele ST5 5BG, Staffs, England

[7] Karolinska Inst, Dept Med, Div Cardiol, S-17177 Stockholm, Sweden

[8] Univ Utrecht, Univ Med Ctr Utrecht, Dept Cardiol, Div Heart & Lungs, Heidelberglaan 100,POB 85500, NL-3508 GA Utrecht, Netherlands

[9] Univ Copenhagen, CSS, Dept Publ Hlth, Sect Epidemiol, Oster Farimagsgade 5, DK-1353 Copenhagen K, Denmark

[10] UCL, Inst Cardiovasc Sci, Fac Populat Hlth Sci, Gower St, London WC1E 6BT, England

Source:

JOURNAL OF CLINICAL EPIDEMIOLOGY | 2021 / Volume 137

Funding:

European Union Horizon 2020;

Keywords:

Prediction model; Calibration; Discrimination; Validation; Heterogeneity; Model comparison; INCIDENT HEART-FAILURE; MULTIPLE IMPUTATION; METAANALYSIS; PERFORMANCE; BIOMARKERS; RISK;

DOI:

10.1016/j.jclinepi.2021.03.025

Chinese Library Classification (CLC) number:

R19 [Health care organization and services (health services administration)];

Abstract:

Objective: To illustrate how to evaluate the need for complex strategies when developing generalizable prediction models in large clustered datasets. Study Design and Setting: We developed eight Cox regression models to estimate the risk of heart failure using a large population-level dataset. These models differed in the number of predictors, the functional form of the predictor effects (non-linear effects and interaction) and the estimation method (maximum likelihood and penalization). Internal-external cross-validation was used to evaluate the models' generalizability across the included general practices. Results: Among 871,687 individuals from 225 general practices, 43,987 (5.0%) developed heart failure during a median follow-up time of 5.8 years. For discrimination, the simplest prediction model yielded a good concordance statistic, which was not much improved by adopting complex strategies. Between-practice heterogeneity in discrimination was similar in all models. For calibration, the simplest model performed satisfactorily. Although accounting for non-linear effects and interaction slightly improved the calibration slope, it also led to more heterogeneity in the observed/expected ratio. Similar results were found in a second case study involving patients with stroke. Conclusion: In large clustered datasets, prediction model studies may adopt internal-external cross-validation to evaluate the generalizability of competing models, and to identify promising modelling strategies. © 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
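The internal-external cross-validation described in the abstract can be sketched roughly as follows: each cluster (here, a general practice) is held out in turn, a model is fitted on the remaining clusters, and discrimination and calibration are measured in the held-out cluster. This is a minimal illustration only, not the authors' implementation. It assumes a binary outcome rather than the time-to-event Cox models used in the study, and it uses a toy "model" (event rates stratified by one binary predictor). All names (`iecv`, `fit_stratified_rates`, `toy_rows`) and the data are hypothetical.

```python
from collections import defaultdict


def c_statistic(risks, outcomes):
    """Concordance for a binary outcome: share of (event, non-event) pairs
    in which the event has the higher predicted risk (ties count 0.5)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        return None  # undefined when one outcome class is missing
    score = 0.0
    for e in events:
        for n in non_events:
            score += 1.0 if e > n else 0.5 if e == n else 0.0
    return score / (len(events) * len(non_events))


def oe_ratio(risks, outcomes):
    """Observed number of events divided by the expected number."""
    expected = sum(risks)
    return sum(outcomes) / expected if expected > 0 else None


def fit_stratified_rates(rows):
    """Toy stand-in for a prediction model (an assumption for illustration):
    the observed event rate at each level of a single binary predictor."""
    counts = defaultdict(lambda: [0, 0])  # level -> [events, total]
    for _, x, y in rows:
        counts[x][0] += y
        counts[x][1] += 1
    return {x: ev / n for x, (ev, n) in counts.items()}


def iecv(rows):
    """Hold out each cluster in turn, fit on the rest, validate on the held-out one."""
    results = {}
    for held_out in sorted({c for c, _, _ in rows}):
        train = [r for r in rows if r[0] != held_out]
        test = [r for r in rows if r[0] == held_out]
        model = fit_stratified_rates(train)
        risks = [model[x] for _, x, _ in test]
        outcomes = [y for _, _, y in test]
        results[held_out] = {
            "c": c_statistic(risks, outcomes),
            "oe": oe_ratio(risks, outcomes),
        }
    return results


# Hypothetical data: (cluster, binary predictor, binary outcome)
toy_rows = [
    ("A", 0, 0), ("A", 0, 0), ("A", 1, 1), ("A", 1, 0),
    ("B", 0, 0), ("B", 0, 1), ("B", 1, 1), ("B", 1, 1),
    ("C", 0, 0), ("C", 0, 0), ("C", 1, 1), ("C", 1, 0),
]
results = iecv(toy_rows)
for cluster, metrics in results.items():
    print(cluster, metrics)
```

The spread of the per-cluster values (for example, how far the observed/expected ratio departs from 1 in each held-out cluster) is what indicates between-cluster heterogeneity. In practice these estimates would typically be pooled with a random-effects meta-analysis, which this sketch omits.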

Pages: 83-91

Number of pages: 9

