Penalized full likelihood approach to variable selection for Cox's regression model under nested case-control sampling

Cited by: 1
Authors
Wang, Jie-Huei [1 ,2 ]
Pan, Chun-Hao [2 ]
Chang, I-Shou [1 ,3 ]
Hsiung, Chao Agnes [1 ]
Affiliations
[1] Natl Hlth Res Inst, Inst Populat Hlth Sci, Div Biostat & Bioinformat, 35 Keyan Rd, Zhunan Town 35053, Miaoli County, Taiwan
[2] Acad Sinica, Inst Stat Sci, 128 Acad Rd,Sect 2, Taipei 11529, Taiwan
[3] Natl Hlth Res Inst, Natl Inst Canc Res, 35 Keyan Rd, Zhunan Town 35053, Miaoli County, Taiwan
Keywords
Nested case-control sampling; Oracle property; PNPMLE; SCAD; CASE-COHORT; CANCER; LASSO;
DOI
10.1007/s10985-019-09475-z
Chinese Library Classification (CLC) number
O1 [Mathematics];
Subject classification code
0701 ; 070101 ;
Abstract
Assuming Cox's regression model, we consider a penalized full likelihood approach to variable selection under nested case-control (NCC) sampling. Penalized non-parametric maximum likelihood estimates (PNPMLEs) are characterized by self-consistency equations derived from score functions. A cross-validation method based on profile likelihood is used to choose the tuning parameter within a family of penalty functions. Simulation studies indicate that, under NCC sampling, the (P)NPMLE performs better numerically than weighted partial likelihood in estimating the log-relative risk and in identifying the covariates and the model. LASSO performs best when cohort size is small; SCAD performs best when cohort size is large and may eventually perform as well as the oracle estimator. Using the SCAD penalty, we establish the consistency, asymptotic normality, and oracle properties of the PNPMLE, as well as the sparsity property of the penalty. We also propose a consistent estimate of the asymptotic variance using observed profile likelihood. We illustrate our method by analyzing the diagnosis of liver cancer among patients in a type 2 diabetes mellitus dataset who were treated with thiazolidinediones in Taiwan.
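The abstract contrasts the LASSO and SCAD penalties without stating their form. As a rough illustration only, the sketch below evaluates the standard SCAD penalty of Fan and Li (2001) next to the LASSO penalty for a single coefficient. The function names and the default `a = 3.7` are assumptions made here for illustration; the abstract does not give the parameterization used in the paper, and this is not the authors' PNPMLE procedure.

```python
import math


def lasso_penalty(theta: float, lam: float) -> float:
    """LASSO (L1) penalty for one coefficient: lam * |theta|."""
    return lam * abs(theta)


def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty (Fan and Li, 2001) for one coefficient.

    lam is the tuning parameter; a > 2 controls where the penalty
    flattens out. a = 3.7 is a commonly used default, assumed here.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Near zero SCAD coincides with the LASSO penalty.
        return lam * t
    if t <= a * lam:
        # Quadratic transition between the linear and constant regions.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients receive a constant penalty, so they are not
    # shrunk further; this is what distinguishes SCAD from LASSO.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for theta in (0.5, 2.0, 10.0):
        print(theta, lasso_penalty(theta, 1.0), scad_penalty(theta, 1.0))
```

Because the SCAD penalty is constant beyond `a * lam`, large coefficients are left essentially unpenalized, which is consistent with the abstract's remark that SCAD may eventually perform as well as the oracle estimator when the cohort is large.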
Pages: 292-314
Page count: 23