Variable selection in semiparametric regression modeling

Cited by: 268
Authors
Li, Runze [1 ,2 ]
Liang, Hua [3 ]
Affiliations
[1] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[2] Penn State Univ, Method Ctr, University Pk, PA 16802 USA
[3] Univ Rochester, Dept Biostat & Computat Biol, Rochester, NY 14642 USA
Source
ANNALS OF STATISTICS | 2008 / Vol. 36 / No. 01
Keywords
local linear regression; nonconcave penalized likelihood; SCAD; varying coefficient models;
DOI
10.1214/009053607000000604
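Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes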
020208 ; 070103 ; 0714 ;
Abstract
In this paper, we are concerned with how to select significant variables in semiparametric modeling. Variable selection for semiparametric regression models consists of two components: model selection for nonparametric components and selection of significant variables for the parametric portion. Thus, semiparametric variable selection is much more challenging than parametric variable selection (e.g., linear and generalized linear models) because traditional variable selection procedures including stepwise regression and the best subset selection now require separate model selection for the nonparametric components for each submodel. This leads to a very heavy computational burden. In this paper, we propose a class of variable selection procedures for semiparametric regression models using nonconcave penalized likelihood. We establish the rate of convergence of the resulting estimate. With proper choices of penalty functions and regularization parameters, we show the asymptotic normality of the resulting estimate and further demonstrate that the proposed procedures perform as well as an oracle procedure. A semiparametric generalized likelihood ratio test is proposed to select significant variables in the nonparametric component. We investigate the asymptotic behavior of the proposed test and demonstrate that its limiting null distribution follows a chi-square distribution which is independent of the nuisance parameters. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed variable selection procedures.
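The abstract names nonconcave penalized likelihood and SCAD but does not state the penalty itself. As a minimal sketch, assuming the standard SCAD form from Fan and Li (2001) with the commonly suggested shape parameter a = 3.7, the penalty and its derivative can be written as below. The function names `scad_penalty` and `scad_derivative` are illustrative and do not come from the paper.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lam(|theta|), assuming the standard Fan-Li form (a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Near zero the penalty is linear, like the L1 (lasso) penalty.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that joins the linear and constant pieces smoothly.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Beyond a*lam the penalty is flat, so large coefficients are not shrunk further.
    return lam * lam * (a + 1) / 2


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|."""
    t = abs(theta)
    if t <= lam:
        return lam
    # Decreases linearly to zero at a*lam and stays zero afterwards.
    return max(a * lam - t, 0.0) / (a - 1)


if __name__ == "__main__":
    for value in (0.5, 2.0, 10.0):
        print(value, scad_penalty(value, lam=1.0), scad_derivative(value, lam=1.0))
```

The derivative vanishing for large coefficients is what lets a penalty of this shape avoid the bias that a purely linear penalty introduces, which is the usual motivation given for oracle-type results such as the one claimed in the abstract.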
Pages: 261 - 286
Number of pages: 26
Related papers
50 in total
  • [41] Bayesian semiparametric variable selection with applications to periodontal data
    Cai, Bo
    Bandyopadhyay, Dipankar
    STATISTICS IN MEDICINE, 2017, 36 (14) : 2251 - 2264
  • [42] Variable Selection for Semiparametric Mixed Models in Longitudinal Studies
    Ni, Xiao
    Zhang, Daowen
    Zhang, Hao Helen
    BIOMETRICS, 2010, 66 (01) : 79 - 88
  • [43] On variable selection in a semiparametric AFT mixture cure model
    Parsa, Motahareh
    Taghavi-Shahri, Seyed Mahmood
    Van Keilegom, Ingrid
    LIFETIME DATA ANALYSIS, 2024, 30 (02) : 472 - 500
  • [44] Automatic variable selection for semiparametric spatial autoregressive model
    Lu, Fang
    Liu, Sisheng
    Yang, Jing
    Lu, Xuewen
    ECONOMETRIC REVIEWS, 2023, 42 (08) : 655 - 675
  • [45] On variable selection in a semiparametric AFT mixture cure model
    Motahareh Parsa
    Seyed Mahmood Taghavi-Shahri
    Ingrid Van Keilegom
    Lifetime Data Analysis, 2024, 30 : 472 - 500
  • [46] Semiparametric estimation of a regression model with an unknown transformation of the dependent variable
    Horowitz, JL
    ECONOMETRICA, 1996, 64 (01) : 103 - 137
  • [47] Semiparametric Bayesian latent variable regression for skewed multivariate data
    Bhingare, Apurva
    Sinha, Debajyoti
    Pati, Debdeep
    Bandyopadhyay, Dipankar
    Lipsitz, Stuart R.
    BIOMETRICS, 2019, 75 (02) : 528 - 538
  • [48] Variable selection in expectile regression
    Zhao, Jun
    Zhang, Yi
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2018, 47 (07) : 1731 - 1746
  • [49] ON VARIABLE SELECTION IN MULTIVARIATE REGRESSION
    SPARKS, RS
    ZUCCHINI, W
    COUTSOURIDES, D
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 1985, 14 (07) : 1569 - 1587
  • [50] On variable selection in linear regression
    Kabaila, P
    ECONOMETRIC THEORY, 2002, 18 (04) : 913 - 925