Bayesian Subset Modeling for High-Dimensional Generalized Linear Models

Cited by: 49
Authors
Liang, Faming [1]
Song, Qifan [1]
Yu, Kai [2]
Affiliations
[1] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
[2] NCI, Div Canc Epidemiol & Genet, Rockville, MD 20892 USA
Funding
U.S. National Science Foundation
Keywords
Bayesian classification; Posterior consistency; Stochastic approximation Monte Carlo; Sure variable screening; Variable selection; VARIABLE-SELECTION; STOCHASTIC-APPROXIMATION; MONTE-CARLO; DISCOVERY; REGRESSION; REGULARIZATION; CONVERGENCE; CONSISTENCY; LIKELIHOOD; SEARCH;
DOI
10.1080/01621459.2012.761942
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
This article presents a new prior setting for high-dimensional generalized linear models, which leads to a Bayesian subset regression (BSR) with the maximum a posteriori model approximately equivalent to the minimum extended Bayesian information criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening and consistency with the existing sure independence screening (SIS) and iterative sure independence screening (ISIS) procedures. However, since the proposed procedure makes use of joint information from all predictors, it generally outperforms SIS and ISIS in real applications. This article also makes extensive comparisons of BSR with the popular penalized likelihood methods, including Lasso, elastic net, SIS, and ISIS. The numerical results indicate that BSR can generally outperform the penalized likelihood methods. The models selected by BSR tend to be sparser and, more importantly, of higher prediction ability. In addition, the performance of the penalized likelihood methods tends to deteriorate as the number of predictors increases, while this is not significant for BSR. Supplementary materials for this article are available online.
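As a rough illustration of screening by marginal inclusion probability, the sketch below estimates each predictor's inclusion probability as the fraction of sampled models that contain it, then keeps the predictors above a cutoff. It assumes the posterior sampler (not shown) returns each sampled model as a set of predictor indices. The function names, the toy samples, and the 0.5 cutoff are hypothetical choices for this example and are not taken from the article.

```python
from collections import Counter


def marginal_inclusion_probabilities(sampled_models, num_predictors):
    """Estimate, for each predictor, the fraction of sampled models containing it."""
    if not sampled_models:
        raise ValueError("at least one sampled model is required")
    counts = Counter()
    for model in sampled_models:
        # set() guards against a predictor being listed twice in one model
        counts.update(set(model))
    total = len(sampled_models)
    return [counts[j] / total for j in range(num_predictors)]


def screen(sampled_models, num_predictors, threshold=0.5):
    """Keep predictors whose estimated inclusion probability reaches the cutoff.

    The default cutoff of 0.5 is an assumption made for this sketch.
    """
    probs = marginal_inclusion_probabilities(sampled_models, num_predictors)
    return [j for j, p in enumerate(probs) if p >= threshold]


# Hypothetical sampler output: four sampled models over five predictors.
samples = [{0, 2}, {0}, {0, 2, 3}, {0, 2}]
probs = marginal_inclusion_probabilities(samples, 5)
kept = screen(samples, 5, threshold=0.5)
```

Because every sampled model is drawn using all predictors jointly, a predictor's estimated probability reflects its role alongside the others rather than its marginal correlation with the response alone.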
Pages: 589-606
Number of pages: 18