Two-stage sampling designs for external validation of personal risk models

Cited by: 7
Authors
Whittemore, Alice S. [1]
Halpern, Jerry [1]
Affiliations
[1] Stanford Univ, Sch Med, Dept Hlth Res & Policy, Redwood Bldg, Stanford, CA 94305 USA
Funding
US National Institutes of Health
Keywords
Bootstrap; calibration; competing risks; discrimination; personal risk models; two-stage sampling; breast-cancer incidence; California Teachers; ovarian cancer; cohort; probabilities
DOI
10.1177/0962280213480420
Chinese Library Classification
R19 [Health care organization and services (health services management)]
Abstract
We propose a cost-effective sampling design and estimating procedure for validating personal risk models using right-censored cohort data. Validation involves using each subject's covariates, as ascertained at cohort entry, in a risk model (specified independently of the data) to assign him/her a probability of an adverse outcome within a future time period. Subjects are then grouped according to the magnitudes of their assigned risks, and within each group, the mean assigned risk is compared with the probability of outcome occurrence as estimated using the follow-up data. Such validation presents two complications. First, in the presence of right-censoring, estimating the probability of developing the outcomes before death requires competing risk analysis. Second, for rare outcomes, validation using the full cohort requires assembling covariates and assigning risks to thousands of subjects. This can be costly when some covariates involve analyzing biological specimens. A two-stage sampling design addresses this problem by assembling covariates and assigning risks only to those subjects most informative for estimating key parameters. We use this design to estimate the outcome probabilities needed to evaluate model performance and we provide theoretical and bootstrap estimates of their variances. We also describe how to choose two-stage designs with minimal efficiency loss for a parameter of interest when the quantities determining optimality are unknown at the time of design. We illustrate these methods by using subjects in the California Teachers Study to validate ovarian cancer risk models. We find that a design with optimal efficiency for one performance parameter need not be so for others, and trade-offs will be required. A two-stage design that samples all outcome-positive subjects and more outcome-negative than censored subjects will perform well in most circumstances. 
The methods are implemented in Risk Model Assessment Program, an R program freely available at http://med.stanford.edu/epidemiology/two-stage.html.
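The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Risk Model Assessment Program, and it makes strong simplifying assumptions: the toy cohort is simulated, every subject is either outcome-positive or outcome-negative (censoring and competing risks, which the paper handles, are omitted), and the second-stage sample is reweighted by inverse sampling probabilities. The names `two_stage_sample`, `weighted_group_summary`, the sampling fractions, and the risk cutpoints are all hypothetical choices made for illustration.

```python
import random


def two_stage_sample(subjects, fractions, rng):
    """Stage 2: keep each subject with a probability set by its stage-1 stratum.

    Each retained subject carries an inverse-probability weight so that
    weighted summaries refer back to the full cohort.
    """
    sampled = []
    for s in subjects:
        p = fractions[s["status"]]
        if rng.random() < p:
            sampled.append({**s, "weight": 1.0 / p})
    return sampled


def weighted_group_summary(sampled, cutpoints):
    """Weighted mean assigned risk and weighted outcome proportion per risk group."""
    groups = {}
    for s in sampled:
        # Group index = number of cutpoints at or below the assigned risk.
        g = sum(s["risk"] >= c for c in cutpoints)
        acc = groups.setdefault(g, {"w": 0.0, "risk": 0.0, "events": 0.0})
        acc["w"] += s["weight"]
        acc["risk"] += s["weight"] * s["risk"]
        acc["events"] += s["weight"] * (s["status"] == "positive")
    return {
        g: {
            "mean_risk": a["risk"] / a["w"],
            "observed": a["events"] / a["w"],
            "weighted_n": a["w"],
        }
        for g, a in sorted(groups.items())
    }


rng = random.Random(0)

# Hypothetical toy cohort: the outcome is generated from the assigned risk,
# so the simulated model is well calibrated by construction.
cohort = []
for i in range(20000):
    risk = rng.uniform(0.0, 0.06)
    status = "positive" if rng.random() < risk else "negative"
    cohort.append({"id": i, "risk": risk, "status": status})

# Assumed design: sample every outcome-positive subject, 10% of the rest.
fractions = {"positive": 1.0, "negative": 0.1}
sampled = two_stage_sample(cohort, fractions, rng)
summary = weighted_group_summary(sampled, cutpoints=[0.02, 0.04])

for g, row in summary.items():
    print(g, round(row["mean_risk"], 4), round(row["observed"], 4))
```

In this toy setting the risk is generated for the whole cohort only so that outcomes can be simulated; in a real two-stage study, covariates and assigned risks would be assembled only for the sampled subjects, which is where the cost saving comes from. Comparing `mean_risk` with `observed` within each group mirrors the calibration check described in the abstract.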
Pages: 1313-1329
Number of pages: 17