Penalized full likelihood approach to variable selection for Cox's regression model under nested case-control sampling

Cited by: 1
Authors
Wang, Jie-Huei [1, 2]
Pan, Chun-Hao [2]
Chang, I-Shou [1, 3]
Hsiung, Chao Agnes [1 ]
Affiliations
[1] Natl Hlth Res Inst, Inst Populat Hlth Sci, Div Biostat & Bioinformat, 35 Keyan Rd, Zhunan Town 35053, Miaoli County, Taiwan
[2] Acad Sinica, Inst Stat Sci, 128 Acad Rd,Sect 2, Taipei 11529, Taiwan
[3] Natl Hlth Res Inst, Natl Inst Canc Res, 35 Keyan Rd, Zhunan Town 35053, Miaoli County, Taiwan
Keywords
Nested case-control sampling; Oracle property; PNPMLE; SCAD; CASE-COHORT; CANCER; LASSO;
DOI
10.1007/s10985-019-09475-z
Chinese Library Classification (CLC) number
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
Assuming Cox's regression model, we consider a penalized full likelihood approach to variable selection under nested case-control (NCC) sampling. Penalized non-parametric maximum likelihood estimates (PNPMLEs) are characterized by self-consistency equations derived from score functions. A cross-validation method based on profile likelihood is used to choose the tuning parameter within a family of penalty functions. Simulation studies indicate that, under NCC sampling, the (P)NPMLE performs better numerically than weighted partial likelihood in estimating the log-relative risk and in identifying the covariates and the model. LASSO performs best when the cohort size is small; SCAD performs best when the cohort size is large and may eventually perform as well as the oracle estimator. Using the SCAD penalty, we establish the consistency, asymptotic normality, and oracle properties of the PNPMLE, as well as the sparsity property of the penalty. We also propose a consistent estimate of the asymptotic variance using observed profile likelihood. We illustrate our method by analyzing the diagnosis of liver cancer among patients in a type 2 diabetes mellitus dataset who were treated with thiazolidinediones in Taiwan.
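The abstract compares the LASSO and SCAD penalties without stating them. As a rough illustration only, the sketch below evaluates the standard SCAD penalty of Fan and Li alongside the LASSO penalty for a single coefficient. The function names, the default a = 3.7, and the sample values are assumptions made for this example; the code is not taken from the paper and does not implement its PNPMLE procedure.

```python
def lasso_penalty(theta, lam):
    """LASSO penalty lam * |theta|; grows without bound in |theta|."""
    return lam * abs(theta)


def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient.

    Assumes lam >= 0 and a > 2. The default a = 3.7 is the value
    commonly suggested in the literature, not one taken from this paper.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Same as LASSO near zero: encourages sparsity.
        return lam * t
    if t <= a * lam:
        # Quadratic transition between the linear and constant pieces.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large |theta|: large coefficients are not shrunk further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    lam = 1.0  # hypothetical tuning parameter
    for theta in (0.5, 2.0, 10.0):
        print(theta, lasso_penalty(theta, lam), scad_penalty(theta, lam))
```

The two penalties agree for small coefficients, while SCAD levels off for large ones. That flat region is what lets SCAD avoid over-shrinking strong effects, which is consistent with the oracle-type behavior the abstract reports for large cohorts.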
Pages: 292 - 314
Number of pages: 23
Related papers
50 in total
  • [31] Information and asymptotic efficiency of the case-cohort sampling design in Cox's regression model
    Zhang, HM
    Goldstein, L
    JOURNAL OF MULTIVARIATE ANALYSIS, 2003, 85 (02) : 292 - 317
  • [32] Bayesian model averaging: improved variable selection for matched case-control studies
    Mu, Yi
    See, Isaac
    Edwards, Jonathan R.
    EPIDEMIOLOGY BIOSTATISTICS AND PUBLIC HEALTH, 2019, 16 (02)
  • [33] Privacy-preserving analysis of time-to-event data under nested case-control sampling
    Juwara, Lamin
    Yang, Yi Archer
    Velly, Ana M.
    Saha-Chaudhuri, Paramita
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2024, 33 (01) : 96 - 111
  • [34] A semiparametric regression model under biased sampling and random censoring: A local pseudo-likelihood approach
    Rabhi, Yassir
    Asgharian, Masoud
    CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2021, 49 (03) : 637 - 658
  • [35] Maximum likelihood for Gaussian process classification and generalized linear mixed models under case-control sampling
    Weissbrod, Omer
    Kaufman, Shachar
    Golan, David
    Rosset, Saharon
    Journal of Machine Learning Research, 2019, 20
  • [37] Evaluation of Cox's model and logistic regression for matched case-control data with time-dependent covariates: a simulation study
    Leffondré, K
    Abrahamowicz, M
    Siemiatycki, J
    STATISTICS IN MEDICINE, 2003, 22 (24) : 3781 - 3794
  • [38] The Target Cohort Approach: An Extension of the Target Trial Framework to Nested Case-Control Studies with Incidence Density Sampling
    Banack, Hailey R.
    Platt, Robert W.
    Matthay, Ellicott C.
    CURRENT EPIDEMIOLOGY REPORTS, 2024, 11 (04) : 199 - 210
  • [39] Variable Selection and Prediction Using a Nested, Matched Case-Control Study: Application to Hospital Acquired Pneumonia in Stroke Patients
    Qian, Jing
    Payabvash, Seyedmehdi
    Kemmling, Andre
    Lev, Michael H.
    Schwamm, Lee H.
    Betensky, Rebecca A.
    BIOMETRICS, 2014, 70 (01) : 153 - 163
  • [40] Comparison of Cox's model versus logistic regression for case-control data with time-varying exposure: A simulation study.
    Leffondré, K
    Abrahamowicz, M
    Siemiatycki, J
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2002, 155 (11) : s48 - s48