Internal-external cross-validation helped to evaluate the generalizability of prediction models in large clustered datasets

Cited by: 28
Authors
Takada, Toshihiko [1]
Nijman, Steven [1]
Denaxas, Spiros [2, 3, 4, 5]
Snell, Kym I. E. [6]
Uijl, Alicia [1, 7, 8]
Nguyen, Tri-Long [1, 9]
Asselbergs, Folkert W. [2, 10]
Debray, Thomas P. A. [1, 2]
Affiliations
[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Univ Weg 100, NL-3584 CG Utrecht, Netherlands
[2] UCL, Hlth Data Res UK & Inst Hlth Informat, Gibbs Bldg,215 Euston Rd, London NW1 2BE, England
[3] Alan Turing Inst, British Lib, 96 Euston Rd, London NW1 2DB, England
[4] UCL, Univ Coll London Hosp, Biomed Res Ctr, Natl Inst Hlth Res, Suite A,1st Floor,Maple House, London W1T 7DN, England
[5] UCL, British Heart Fdn Res Accelerator, Gower St, London WC1E 6BT, England
[6] Keele Univ, Sch Med, Ctr Prognosis Res, Keele ST5 5BG, Staffs, England
[7] Karolinska Inst, Dept Med, Div Cardiol, S-17177 Stockholm, Sweden
[8] Univ Utrecht, Univ Med Ctr Utrecht, Dept Cardiol, Div Heart & Lungs, Heidelberglaan 100,POB 85500, NL-3508 GA Utrecht, Netherlands
[9] Univ Copenhagen, CSS, Dept Publ Hlth, Sect Epidemiol, Oster Farimagsgade 5, DK-1353 Copenhagen K, Denmark
[10] UCL, Inst Cardiovasc Sci, Fac Populat Hlth Sci, Gower St, London WC1E 6BT, England
Funding
European Union Horizon 2020;
Keywords
Prediction model; Calibration; Discrimination; Validation; Heterogeneity; Model comparison; INCIDENT HEART-FAILURE; MULTIPLE IMPUTATION; METAANALYSIS; PERFORMANCE; BIOMARKERS; RISK;
DOI
10.1016/j.jclinepi.2021.03.025
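Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Discipline classification code
Abstract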
Objective: To illustrate how to evaluate the need for complex strategies when developing generalizable prediction models in large clustered datasets. Study Design and Setting: We developed eight Cox regression models to estimate the risk of heart failure using a large population-level dataset. These models differed in the number of predictors, the functional form of the predictor effects (non-linear effects and interactions) and the estimation method (maximum likelihood and penalization). Internal-external cross-validation was used to evaluate the models' generalizability across the included general practices. Results: Among 871,687 individuals from 225 general practices, 43,987 (5.0%) developed heart failure during a median follow-up time of 5.8 years. For discrimination, the simplest prediction model yielded a good concordance statistic, which was not much improved by adopting complex strategies. Between-practice heterogeneity in discrimination was similar in all models. For calibration, the simplest model performed satisfactorily. Although accounting for non-linear effects and interactions slightly improved the calibration slope, it also led to more heterogeneity in the observed/expected ratio. Similar results were found in a second case study involving patients with stroke. Conclusion: In large clustered datasets, prediction model studies may adopt internal-external cross-validation to evaluate the generalizability of competing models, and to identify promising modelling strategies. © 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
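The internal-external cross-validation idea named in the abstract can be illustrated with a minimal sketch: each cluster (here, a general practice) is held out in turn, a model is fitted on the remaining clusters, and discrimination and calibration are then estimated in the held-out cluster. The sketch below is not the authors' implementation. As a simplifying assumption it swaps the Cox models for a one-predictor logistic model with a binary outcome, uses simulated data, and all function names (`fit_logistic`, `iecv`, `simulate`, and so on) are hypothetical.

```python
import math
import random


def fit_logistic(xs, ys, lr=0.5, epochs=300):
    """Fit a one-predictor logistic model by plain gradient descent (illustrative only)."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += p - y
            grad_b += (p - y) * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def predict(model, xs):
    """Predicted risks for a fitted (intercept, slope) pair."""
    a, b = model
    return [1.0 / (1.0 + math.exp(-(a + b * x))) for x in xs]


def c_statistic(preds, ys):
    """Share of event/non-event pairs ranked correctly; ties count as 0.5."""
    events = [p for p, y in zip(preds, ys) if y == 1]
    nonevents = [p for p, y in zip(preds, ys) if y == 0]
    if not events or not nonevents:
        return float("nan")
    score = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in nonevents
    )
    return score / (len(events) * len(nonevents))


def oe_ratio(preds, ys):
    """Observed/expected ratio: observed events over the sum of predicted risks."""
    return sum(ys) / sum(preds)


def iecv(data):
    """Hold out each cluster once; data maps cluster name -> (xs, ys)."""
    results = {}
    for held_out in data:
        train_x = [x for c, (xs, _) in data.items() if c != held_out for x in xs]
        train_y = [y for c, (_, ys) in data.items() if c != held_out for y in ys]
        model = fit_logistic(train_x, train_y)
        test_x, test_y = data[held_out]
        preds = predict(model, test_x)
        results[held_out] = (c_statistic(preds, test_y), oe_ratio(preds, test_y))
    return results


def simulate(n_clusters=5, n_per_cluster=200, seed=1):
    """Simulated practices whose baseline risk varies between clusters (an assumption)."""
    rng = random.Random(seed)
    data = {}
    for c in range(n_clusters):
        intercept = -1.0 + rng.gauss(0, 0.3)
        xs = [rng.gauss(0, 1) for _ in range(n_per_cluster)]
        ys = [
            1 if rng.random() < 1.0 / (1.0 + math.exp(-(intercept + x))) else 0
            for x in xs
        ]
        data[f"practice_{c}"] = (xs, ys)
    return data


results = iecv(simulate())
for name, (c_stat, oe) in results.items():
    print(f"{name}: C = {c_stat:.2f}, O/E = {oe:.2f}")
```

The spread of the per-practice C-statistics and O/E ratios is what such an analysis would inspect to judge between-cluster heterogeneity; a full analysis would typically pool these estimates with a random-effects meta-analysis, which this sketch omits.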
Pages: 83-91
Page count: 9