Internal-external cross-validation helped to evaluate the generalizability of prediction models in large clustered datasets

Cited by: 28
Authors
Takada, Toshihiko [1]
Nijman, Steven [1]
Denaxas, Spiros [2, 3, 4, 5]
Snell, Kym I. E. [6]
Uijl, Alicia [1, 7, 8]
Nguyen, Tri-Long [1, 9]
Asselbergs, Folkert W. [2, 10]
Debray, Thomas P. A. [1, 2]
Affiliations
[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Univ Weg 100, NL-3584 CG Utrecht, Netherlands
[2] UCL, Hlth Data Res UK & Inst Hlth Informat, Gibbs Bldg,215 Euston Rd, London NW1 2BE, England
[3] Alan Turing Inst, British Lib, 96 Euston Rd, London NW1 2DB, England
[4] UCL, Univ Coll London Hosp, Biomed Res Ctr, Natl Inst Hlth Res, Suite A,1st Floor,Maple House, London W1T 7DN, England
[5] UCL, British Heart Fdn Res Accelerator, Gower St, London WC1E 6BT, England
[6] Keele Univ, Sch Med, Ctr Prognosis Res, Keele ST5 5BG, Staffs, England
[7] Karolinska Inst, Dept Med, Div Cardiol, S-17177 Stockholm, Sweden
[8] Univ Utrecht, Univ Med Ctr Utrecht, Dept Cardiol, Div Heart & Lungs, Heidelberglaan 100,POB 85500, NL-3508 GA Utrecht, Netherlands
[9] Univ Copenhagen, CSS, Dept Publ Hlth, Sect Epidemiol, Oster Farimagsgade 5, DK-1353 Copenhagen K, Denmark
[10] UCL, Inst Cardiovasc Sci, Fac Populat Hlth Sci, Gower St, London WC1E 6BT, England
Funding
EU Horizon 2020
Keywords
Prediction model; Calibration; Discrimination; Validation; Heterogeneity; Model comparison; INCIDENT HEART-FAILURE; MULTIPLE IMPUTATION; METAANALYSIS; PERFORMANCE; BIOMARKERS; RISK;
DOI
10.1016/j.jclinepi.2021.03.025
Chinese Library Classification (CLC) number
R19 [Health organizations and services (health services management)]
Abstract
Objective: To illustrate how to evaluate the need for complex strategies when developing generalizable prediction models in large clustered datasets. Study Design and Setting: We developed eight Cox regression models to estimate the risk of heart failure using a large population-level dataset. These models differed in the number of predictors, the functional form of the predictor effects (non-linear effects and interaction) and the estimation method (maximum likelihood and penalization). Internal-external cross-validation was used to evaluate the models' generalizability across the included general practices. Results: Among 871,687 individuals from 225 general practices, 43,987 (5.0%) developed heart failure during a median follow-up time of 5.8 years. For discrimination, the simplest prediction model yielded a good concordance statistic, which was not much improved by adopting complex strategies. Between-practice heterogeneity in discrimination was similar in all models. For calibration, the simplest model performed satisfactorily. Although accounting for non-linear effects and interaction slightly improved the calibration slope, it also led to more heterogeneity in the observed/expected ratio. Similar results were found in a second case study involving patients with stroke. Conclusion: In large clustered datasets, prediction model studies may adopt internal-external cross-validation to evaluate the generalizability of competing models, and to identify promising modelling strategies. (c) 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
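The internal-external cross-validation procedure named in the abstract can be illustrated with a minimal sketch: each cluster (here, a general practice) is held out in turn, the model is fitted on the remaining clusters, and discrimination and calibration are computed in the held-out cluster. The sketch below is not the authors' implementation. It assumes a binary outcome and a single-predictor logistic model in place of the Cox models used in the study, it runs on simulated data, and all function and variable names (`fit_logistic`, `iecv`, `simulate`, `practice_0`, and so on) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(x, y, n_iter=25):
    """Fit y ~ intercept + slope * x by Newton-Raphson; returns (intercept, slope)."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:               # information matrix is (near) singular
            break
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def c_statistic(pred, y):
    """Share of event/non-event pairs in which the event has the higher prediction."""
    events = [p for p, yi in zip(pred, y) if yi == 1]
    nonevents = [p for p, yi in zip(pred, y) if yi == 0]
    score = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(nonevents))


def iecv(data):
    """data maps cluster id -> (x, y). Each cluster is held out once."""
    results = {}
    for held_out in data:
        x_train = [xi for c, (x, _) in data.items() if c != held_out for xi in x]
        y_train = [yi for c, (_, y) in data.items() if c != held_out for yi in y]
        a, b = fit_logistic(x_train, y_train)
        x_val, y_val = data[held_out]
        lp = [a + b * xi for xi in x_val]          # linear predictor in held-out cluster
        pred = [sigmoid(v) for v in lp]
        _, slope = fit_logistic(lp, y_val)         # calibration slope
        results[held_out] = {
            "c_statistic": c_statistic(pred, y_val),
            "calibration_slope": slope,
            "oe_ratio": sum(y_val) / sum(pred),    # observed / expected events
        }
    return results


def simulate(n_clusters=5, n_per_cluster=300, seed=1):
    # Simulated data only: each cluster gets its own baseline risk.
    rng = random.Random(seed)
    data = {}
    for c in range(n_clusters):
        intercept = -1.0 + rng.gauss(0.0, 0.3)
        x = [rng.gauss(0.0, 1.0) for _ in range(n_per_cluster)]
        y = [1 if rng.random() < sigmoid(intercept + 0.8 * xi) else 0 for xi in x]
        data[f"practice_{c}"] = (x, y)
    return data


results = iecv(simulate())
for name, metrics in results.items():
    print(name, {k: round(v, 2) for k, v in metrics.items()})
```

The spread of the per-cluster values, rather than any single value, is what indicates generalizability: a model whose observed/expected ratio varies widely across held-out practices is less transportable, even if its average performance looks good. In practice the cluster-specific estimates would typically be pooled with a random-effects meta-analysis, which this sketch omits.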
Pages: 83 - 91
Page count: 9
Related papers
50 in total
  • [31] Predicting the risk of pancreatic cancer in adults with new-onset diabetes: development and internal-external validation of a clinical risk prediction model
    Clift, Ash Kieran
    Tan, Pui San
    Patone, Martina
    Liao, Weiqi
    Coupland, Carol
    Bashford-Rogers, Rachael
    Sivakumar, Shivan
    Hippisley-Cox, Julia
    BRITISH JOURNAL OF CANCER, 2024, 130 (12) : 1969 - 1978
  • [32] Joint use of over- and under-sampling techniques and cross-validation for the development and assessment of prediction models
    Blagus, Rok
    Lusa, Lara
    BMC BIOINFORMATICS, 2015, 16
  • [34] Development and internal-external validation of the ATHE Scale: predicting acute large vessel occlusion due to underlying intracranial atherosclerosis prior to endovascular treatment
    Chen, Wang
    Liu, Ji
    Yang, Lei
    Sun, Hongyang
    Yang, Shuna
    Wang, Mengen
    Qin, Wei
    Wang, Yang
    Wang, Xianjun
    Hu, Wenli
    JOURNAL OF NEUROSURGERY, 2023, 141 (01) : 165 - 174
  • [35] Explanations of Machine Learning Models in Repeated Nested Cross-Validation: An Application in Age Prediction Using Brain Complexity Features
    Scheda, Riccardo
    Diciotti, Stefano
    APPLIED SCIENCES-BASEL, 2022, 12 (13):
  • [36] COLOFIT : Development and Internal-External Validation of Models Using Age, Sex, Faecal Immunochemical and Blood Tests to Optimise Diagnosis of Colorectal Cancer in Symptomatic Patients
    Crooks, C. J.
    West, J.
    Jones, J.
    Hamilton, W.
    Bailey, S. E. R.
    Abel, G.
    Banerjea, A.
    Rees, C. J.
    Tamm, A.
    Nicholson, B. D.
    Benton, S. C.
    Hunt, N.
    COLOFIT Res Grp
    Humes, D. J.
    ALIMENTARY PHARMACOLOGY & THERAPEUTICS, 2025, 61 (05) : 852 - 864
  • [37] Imaging bolometers for visualization of plasma radiation and cross-validation of three dimensional impurity transport models for the large helical device
    Peterson, B.J. (peterson@LHD.nifs.ac.jp), 1600, Japan Society of Plasma Science and Nuclear Fusion Research (08):
  • [38] A simulation study to compare cross-validation versus holdout or external testing to assess the performance of machine learning based clinical prediction rules
    Boellaard, R.
    Eertink, J. J.
    Lugtenburg, P. J.
    Zwezerijnen, G. J.
    Wiegers, S. E.
    de Vet, H. C.
    Zijlstra, J. M.
    EUROPEAN JOURNAL OF NUCLEAR MEDICINE AND MOLECULAR IMAGING, 2021, 48 (SUPPL 1) : S285 - S285
  • [39] Dynamic and Transdiagnostic Risk Calculator Based on Natural Language Processing for the Prediction of Psychosis in Secondary Mental Health Care: Development and Internal-External Validation Cohort Study
    Krakowski, Kamil
    Oliver, Dominic
    Arribas, Maite
    Stahl, Daniel
    Fusar-Poli, Paolo
    BIOLOGICAL PSYCHIATRY, 2024, 96 (07) : 604 - 614
  • [40] Development and internal validation of machine learning-based models and external validation of existing risk scores for outcome prediction in patients with ischaemic stroke
    Axford, Daniel
    Sohel, Ferdous
    Abedi, Vida
    Zhu, Ye
    Zand, Ramin
    Barkoudah, Ebrahim
    Krupica, Troy
    Iheasirim, Kingsley
    Sharma, Umesh M.
    Dugani, Sagar B.
    Takahashi, Paul Y.
    Bhagra, Sumit
    Murad, Mohammad H.
    Saposnik, Gustavo
    Yousufuddin, Mohammed
    EUROPEAN HEART JOURNAL - DIGITAL HEALTH, 2024, 5 (02) : 109 - 122