Using the EM algorithm for Bayesian variable selection in logistic regression models with related covariates

Cited by: 9
Authors
Koslovsky, M. D. [1 ]
Swartz, M. D. [1 ]
Leon-Novelo, L. [1 ]
Chan, W. [1 ]
Wilkinson, A. V. [2 ]
Affiliations
[1] UTHealth, Dept Biostat, 1200 Pressler St, Houston, TX 77030 USA
[2] UTHealth, Dept Epidemiol, Austin, TX USA
Keywords
Bayesian inference; binary outcomes; deterministic annealing; expectation-maximization; grouped covariates; heredity constraint; inheritance property; variable selection; 62F15; 62J12; 68U20; GROUP LASSO; YOUTH; SMOKING; BINARY;
DOI
10.1080/00949655.2017.1398255
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
We develop a Bayesian variable selection method for logistic regression models that can simultaneously accommodate qualitative covariates and interaction terms under various heredity constraints. We use expectation-maximization variable selection (EMVS) with a deterministic annealing variant as the platform for our method, due to its proven flexibility and efficiency. We propose a variance adjustment of the priors for the coefficients of qualitative covariates, which controls false-positive rates, and a flexible parameterization for interaction terms, which accommodates user-specified heredity constraints. This method can handle all pairwise interaction terms as well as a subset of specific interactions. Using simulation, we show that this method selects associated covariates better than the grouped LASSO and the LASSO with heredity constraints in various exploratory research scenarios encountered in epidemiological studies. We apply our method to identify genetic and non-genetic risk factors associated with smoking experimentation in a cohort of Mexican-heritage adolescents.
Pages: 575-596
Number of pages: 22
Related papers
50 in total
  • [21] An adaptive MCMC method for Bayesian variable selection in logistic and accelerated failure time regression models
    Wan, Kitty Yuen Yi
    Griffin, Jim E.
    STATISTICS AND COMPUTING, 2021, 31 (01)
  • [22] Bayesian estimation of logistic regression with misclassified covariates and response
    Falley, Brandi N.
    Stamey, James D.
    Beaujean, A. Alexander
    JOURNAL OF APPLIED STATISTICS, 2018, 45 (10) : 1756 - 1769
  • [23] Variable selection in regression with compositional covariates
    Lin, Wei
    Shi, Pixu
    Feng, Rui
    Li, Hongzhe
    BIOMETRIKA, 2014, 101 (04) : 785 - 797
  • [24] Bayesian model selection for logistic regression models with random intercept
    Wagner, Helga
    Duller, Christine
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (05) : 1256 - 1274
  • [25] The EM algorithm for mixture regression with missing covariates
    Kim, Hyungmin
    Ham, Geonhee
    Seo, Byungtae
    KOREAN JOURNAL OF APPLIED STATISTICS, 2016, 29 (07) : 1347 - 1359
  • [26] Bayesian variable selection in logistic regression: Predicting company earnings direction
    Gerlach, R
    Bird, R
    Hall, A
    AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, 2002, 44 (02) : 155 - 168
  • [27] Bayesian structured variable selection in linear regression models
    Min Wang
    Xiaoqian Sun
    Tao Lu
    Computational Statistics, 2015, 30 : 205 - 229
  • [28] Bayesian structured variable selection in linear regression models
    Wang, Min
    Sun, Xiaoqian
    Lu, Tao
    COMPUTATIONAL STATISTICS, 2015, 30 (01) : 205 - 229
  • [29] Bayesian Variable Selection for Gaussian Copula Regression Models
    Alexopoulos, Angelos
    Bottolo, Leonardo
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2021, 30 (03) : 578 - 593
  • [30] Instrumental variable based SEE variable selection for Poisson regression models with endogenous covariates
    Huang, Jiting
    Zhao, Peixin
    Huang, Xingshou
    JOURNAL OF APPLIED MATHEMATICS AND COMPUTING, 2019, 59 (1-2) : 163 - 178