Bayesian feature selection in high-dimensional regression in presence of correlated noise

Cited by: 1
Authors
Feldman, Guy [1]
Bhadra, Anindya [1]
Kirshner, Sergey [1]
Affiliations
[1] Purdue Univ, Dept Stat, 250 N Univ St, W Lafayette, IN 47907 USA
Source
STAT | 2014 / Volume 3 / Issue 01
Keywords
Bayesian methods; genomics; graphical models; high-dimensional data; variable selection;
DOI
10.1002/sta4.60
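Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code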
020208 ; 070103 ; 0714 ;
Abstract
We consider the problem of feature selection in a high-dimensional multiple predictors, multiple responses regression setting. Assuming that regression errors are i.i.d. when they are in fact dependent leads to inconsistent and inefficient feature estimates. We relax the i.i.d. assumption by allowing the errors to exhibit a tree-structured dependence. This allows a Bayesian problem formulation with the error dependence structure treated as an auxiliary variable that can be integrated out analytically with the help of the matrix-tree theorem. Mixing over trees results in a flexible technique for modelling the graphical structure for the regression errors. Furthermore, the analytic integration results in a collapsed Gibbs sampler for feature selection that is computationally efficient. Our approach offers significant performance gains over the competing methods in simulations, especially when the features themselves are correlated. In addition to comprehensive simulation studies, we apply our method to a high-dimensional breast cancer data set to identify markers significantly associated with the disease. Copyright (C) 2014 John Wiley & Sons, Ltd.
Pages: 258 - 272
Number of pages: 15
Related papers
50 items in total
  • [41] New approach to Bayesian high-dimensional linear regression
    Jalali, Shirin
    Maleki, Arian
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2018, 7 (04) : 605 - 655
  • [42] Bayesian high-dimensional regression for change point analysis
    Datta, Abhirup
    Zou, Hui
    Banerjee, Sudipto
    STATISTICS AND ITS INTERFACE, 2019, 12 (02) : 253 - 264
  • [43] Nearly optimal Bayesian shrinkage for high-dimensional regression
    Song, Qifan
    Liang, Faming
    SCIENCE CHINA-MATHEMATICS, 2023, 66 (02) : 409 - 442
  • [44] Nearly optimal Bayesian shrinkage for high-dimensional regression
    Qifan Song
    Faming Liang
    ScienceChina(Mathematics), 2023, 66 (02) : 409 - 442
  • [45] Nearly optimal Bayesian shrinkage for high-dimensional regression
    Qifan Song
    Faming Liang
    Science China Mathematics, 2023, 66 : 409 - 442
  • [46] Adaptive Bayesian density regression for high-dimensional data
    Shen, Weining
    Ghosal, Subhashis
    BERNOULLI, 2016, 22 (01) : 396 - 420
  • [47] Analysis of Ensemble Feature Selection for Correlated High-Dimensional RNA-Seq Cancer Data
    Polewko-Klim, Aneta
    Rudnicki, Witold R.
    COMPUTATIONAL SCIENCE - ICCS 2020, PT III, 2020, 12139 : 525 - 538
  • [48] BayesSUR: An R Package for High-Dimensional Multivariate Bayesian Variable and Covariance Selection in Linear Regression
    Zhao, Zhi
    Banterle, Marco
    Bottolo, Leonardo
    Richardson, Sylvia
    Lewin, Alex
    Zucknick, Manuela
    JOURNAL OF STATISTICAL SOFTWARE, 2021, 100 (11) : 1 - 32
  • [49] Bayesian variable selection in clustering high-dimensional data
    Tadesse, MG
    Sha, N
    Vannucci, M
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2005, 100 (470) : 602 - 617
  • [50] Bayesian variable selection for high-dimensional rank data
    Cui, Can
    Singh, Susheela P.
    Staicu, Ana-Maria
    Reich, Brian J.
    ENVIRONMETRICS, 2021, 32 (07)