Internal-external cross-validation helped to evaluate the generalizability of prediction models in large clustered datasets

Cited by: 28
Authors
Takada, Toshihiko [1]
Nijman, Steven [1]
Denaxas, Spiros [2, 3, 4, 5]
Snell, Kym I. E. [6]
Uijl, Alicia [1, 7, 8]
Nguyen, Tri-Long [1, 9]
Asselbergs, Folkert W. [2, 10]
Debray, Thomas P. A. [1, 2]
Affiliations
[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Univ Weg 100, NL-3584 CG Utrecht, Netherlands
[2] UCL, Hlth Data Res UK & Inst Hlth Informat, Gibbs Bldg,215 Euston Rd, London NW1 2BE, England
[3] Alan Turing Inst, British Lib, 96 Euston Rd, London NW1 2DB, England
[4] UCL, Univ Coll London Hosp, Biomed Res Ctr, Natl Inst Hlth Res, Suite A,1st Floor,Maple House, London W1T 7DN, England
[5] UCL, British Heart Fdn Res Accelerator, Gower St, London WC1E 6BT, England
[6] Keele Univ, Sch Med, Ctr Prognosis Res, Keele ST5 5BG, Staffs, England
[7] Karolinska Inst, Dept Med, Div Cardiol, S-17177 Stockholm, Sweden
[8] Univ Utrecht, Univ Med Ctr Utrecht, Dept Cardiol, Div Heart & Lungs, Heidelberglaan 100,POB 85500, NL-3508 GA Utrecht, Netherlands
[9] Univ Copenhagen, CSS, Dept Publ Hlth, Sect Epidemiol, Oster Farimagsgade 5, DK-1353 Copenhagen K, Denmark
[10] UCL, Inst Cardiovasc Sci, Fac Populat Hlth Sci, Gower St, London WC1E 6BT, England
Funding
European Union Horizon 2020
Keywords
Prediction model; Calibration; Discrimination; Validation; Heterogeneity; Model comparison; INCIDENT HEART-FAILURE; MULTIPLE IMPUTATION; METAANALYSIS; PERFORMANCE; BIOMARKERS; RISK;
DOI
10.1016/j.jclinepi.2021.03.025
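Chinese Library Classification number
R19 [Health care organizations and services (health services management)]
Subject classification code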
Abstract
Objective: To illustrate how to evaluate the need for complex strategies when developing generalizable prediction models in large clustered datasets. Study Design and Setting: We developed eight Cox regression models to estimate the risk of heart failure using a large population-level dataset. These models differed in the number of predictors, the functional form of the predictor effects (non-linear effects and interactions) and the estimation method (maximum likelihood and penalization). Internal-external cross-validation was used to evaluate the models' generalizability across the included general practices. Results: Among 871,687 individuals from 225 general practices, 43,987 (5.0%) developed heart failure during a median follow-up time of 5.8 years. For discrimination, the simplest prediction model yielded a good concordance statistic, which was not much improved by adopting complex strategies. Between-practice heterogeneity in discrimination was similar in all models. For calibration, the simplest model performed satisfactorily. Although accounting for non-linear effects and interactions slightly improved the calibration slope, it also led to more heterogeneity in the observed/expected ratio. Similar results were found in a second case study involving patients with stroke. Conclusion: In large clustered datasets, prediction model studies may adopt internal-external cross-validation to evaluate the generalizability of competing models, and to identify promising modelling strategies. © 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
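The internal-external cross-validation loop described in the abstract can be sketched roughly as follows: each cluster (here, a general practice) is held out in turn, a model is developed on the remaining clusters, and its calibration is checked in the held-out cluster via the observed/expected (O/E) ratio. This is only a minimal illustration and not the authors' implementation. The toy data, the field names `cluster`, `x` and `event`, and the stratified event-rate "model" are all assumptions made for the example; the study itself fitted Cox regression models.

```python
from collections import defaultdict


def fit_stratified_rates(rows):
    """Toy stand-in for a prediction model (assumption, not a Cox model):
    the predicted risk is the event proportion per level of one predictor."""
    events = defaultdict(int)
    totals = defaultdict(int)
    for r in rows:
        totals[r["x"]] += 1
        events[r["x"]] += r["event"]
    overall = sum(events.values()) / sum(totals.values())
    rates = {level: events[level] / totals[level] for level in totals}
    # Fall back to the overall rate for predictor levels unseen in training.
    return lambda x: rates.get(x, overall)


def internal_external_cv(rows, fit=fit_stratified_rates):
    """Hold out each cluster once, develop the model on all other clusters,
    and return the observed/expected ratio in each held-out cluster."""
    results = {}
    for held_out in sorted({r["cluster"] for r in rows}):
        train = [r for r in rows if r["cluster"] != held_out]
        test = [r for r in rows if r["cluster"] == held_out]
        predict = fit(train)
        observed = sum(r["event"] for r in test)
        expected = sum(predict(r["x"]) for r in test)
        results[held_out] = observed / expected if expected > 0 else float("nan")
    return results


def make_rows(cluster, x, n, n_events):
    """Build n hypothetical individuals, the first n_events of whom have the event."""
    return [
        {"cluster": cluster, "x": x, "event": 1 if i < n_events else 0}
        for i in range(n)
    ]


# Hypothetical data: three practices, one binary predictor.
toy_rows = (
    make_rows("A", 0, 4, 1) + make_rows("A", 1, 4, 3)
    + make_rows("B", 0, 4, 1) + make_rows("B", 1, 4, 3)
    + make_rows("C", 0, 4, 2) + make_rows("C", 1, 4, 4)
)

oe_ratios = internal_external_cv(toy_rows)
for practice, ratio in oe_ratios.items():
    print(f"held-out practice {practice}: O/E = {ratio:.2f}")
```

An O/E ratio near 1 in every held-out practice suggests stable calibration-in-the-large, while a wide spread across practices indicates between-practice heterogeneity. In a fuller analysis the per-practice estimates would typically be pooled with a random-effects meta-analysis, and other measures such as the concordance statistic and calibration slope would be computed in the same loop.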
Pages: 83-91
Page count: 9