Variable selection for high-dimensional genomic data with censored outcomes using group lasso prior

Cited by: 8
Authors
Lee, Kyu Ha [1, 2]
Chakraborty, Sounak [3]
Sun, Jianguo [3]
Affiliations
[1] Forsyth Inst, Epidemiol & Biostat Core, Cambridge, MA USA
[2] Harvard Sch Dent Med, Dept Oral Hlth Policy & Epidemiol, Boston, MA USA
[3] Univ Missouri, Dept Stat, Columbia, MO 65211 USA
Funding
National Science Foundation (USA);
Keywords
Accelerated failure time model; Bayesian lasso; Gibbs sampler; Group lasso; Penalized regression; FAILURE TIME MODEL; MICROARRAY DATA; SURVIVAL ANALYSIS; HAZARD RATIOS; ELASTIC NET; COX MODEL; REGRESSION; PREDICTION; SHRINKAGE;
DOI
10.1016/j.csda.2017.02.014
Chinese Library Classification
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
The variable selection problem is discussed in the context of high-dimensional failure time data arising from the accelerated failure time model. A data augmentation approach is employed to deal with censored survival times and to facilitate prior-posterior conjugacy. To identify a set of grouped relevant covariates, a shrinkage prior distribution that mimics the effect of the group lasso penalty is specified for the regression coefficients. Unlike the corresponding frequentist method, a Bayesian penalized regression approach cannot, in general, shrink the coefficient estimates to exact zeros. To resolve this issue, a two-stage thresholding method that exploits the scaled neighborhood criterion and the Bayesian information criterion is devised. Simulation studies are performed to assess the robustness and performance of the proposed method in terms of variable selection accuracy and predictive power. The method is successfully applied to a set of microarray data on individuals diagnosed with diffuse large B-cell lymphoma. In addition, an R package called psbcGroup, which can be downloaded freely from CRAN, is developed for the implementation of the methods. (C) 2017 Elsevier B.V. All rights reserved.
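As a rough illustration of the thresholding idea mentioned in the abstract, the sketch below applies a scaled neighborhood rule to posterior draws of a single coefficient: the coefficient is dropped when the posterior probability that it lies within one posterior standard deviation of zero exceeds a cutoff. This is an assumed, commonly used formulation of the criterion, not code from the paper or from psbcGroup; the function name `scaled_neighborhood_keep`, the cutoff of 0.5, and the simulated draws are all hypothetical, and the paper's grouped, two-stage procedure with BIC may differ in its details.

```python
import random
import statistics


def scaled_neighborhood_keep(draws, threshold=0.5):
    """Decide whether to keep one coefficient from its posterior draws.

    Assumed rule: estimate P(|beta| <= posterior sd) from the draws and
    drop the coefficient when that probability exceeds `threshold`.
    The default threshold of 0.5 is an illustrative assumption.
    """
    sd = statistics.stdev(draws)
    prob_inside = sum(abs(b) <= sd for b in draws) / len(draws)
    return prob_inside <= threshold, prob_inside


# Simulated stand-ins for MCMC output (hypothetical, not real results).
rng = random.Random(0)
noise_draws = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # centered at zero
signal_draws = [rng.gauss(5.0, 1.0) for _ in range(2000)]  # far from zero

keep_noise, p_noise = scaled_neighborhood_keep(noise_draws)
keep_signal, p_signal = scaled_neighborhood_keep(signal_draws)
print("noise coefficient kept:", keep_noise)
print("signal coefficient kept:", keep_signal)
```

A coefficient whose posterior mass concentrates around zero has most of its draws inside the scaled neighborhood and is dropped, while one centered far from zero is retained.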
Pages: 1-13
Number of pages: 13
Related papers
50 in total
  • [31] A study on tuning parameter selection for the high-dimensional lasso
    Homrighausen, Darren
    McDonald, Daniel J.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (15) : 2865 - 2892
  • [32] Bayesian Variable Selection in Clustering High-Dimensional Data With Substructure
    Swartz, Michael D.
    Mo, Qianxing
    Murphy, Mary E.
    Lupton, Joanne R.
    Turner, Nancy D.
    Hong, Mee Young
    Vannucci, Marina
    JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS, 2008, 13 (04) : 407 - 423
  • [33] Stochastic variational variable selection for high-dimensional microbiome data
    Dang, Tung
    Kumaishi, Kie
    Usui, Erika
    Kobori, Shungo
    Sato, Takumi
    Toda, Yusuke
    Yamasaki, Yuji
    Tsujimoto, Hisashi
    Ichihashi, Yasunori
    Iwata, Hiroyoshi
    MICROBIOME, 2022, 10 (01)
  • [34] High-dimensional variable selection in regression and classification with missing data
    Gao, Qi
    Lee, Thomas C. M.
    SIGNAL PROCESSING, 2017, 131 : 1 - 7
  • [35] Scalable Bayesian variable selection for structured high-dimensional data
    Chang, Changgee
    Kundu, Suprateek
    Long, Qi
    BIOMETRICS, 2018, 74 (04) : 1372 - 1382
  • [36] Sparse Bayesian variable selection for classifying high-dimensional data
    Yang, Aijun
    Lian, Heng
    Jiang, Xuejun
    Liu, Pengfei
    STATISTICS AND ITS INTERFACE, 2018, 11 (02) : 385 - 395
  • [37] RANKING-BASED VARIABLE SELECTION FOR HIGH-DIMENSIONAL DATA
    Baranowski, Rafal
    Chen, Yining
    Fryzlewicz, Piotr
    STATISTICA SINICA, 2020, 30 (03) : 1485 - 1516
  • [38] Bayesian variable selection in clustering high-dimensional data with substructure
    Michael D. Swartz
    Qianxing Mo
    Mary E. Murphy
    Joanne R. Lupton
    Nancy D. Turner
    Mee Young Hong
    Marina Vannucci
    Journal of Agricultural, Biological, and Environmental Statistics, 2008, 13 : 407 - 423
  • [39] A Robust Supervised Variable Selection for Noisy High-Dimensional Data
    Kalina, Jan
    Schlenker, Anna
    BIOMED RESEARCH INTERNATIONAL, 2015, 2015
  • [40] Estimation and variable selection for high-dimensional spatial data models
    Hou, Li
    Jin, Baisuo
    Wu, Yuehua
    JOURNAL OF ECONOMETRICS, 2024, 238 (02)