Using the EM algorithm for Bayesian variable selection in logistic regression models with related covariates

Cited by: 9
Authors
Koslovsky, M. D. [1 ]
Swartz, M. D. [1 ]
Leon-Novelo, L. [1 ]
Chan, W. [1 ]
Wilkinson, A. V. [2 ]
Affiliations
[1] UTHealth, Dept Biostat, 1200 Pressler St, Houston, TX 77030 USA
[2] UTHealth, Dept Epidemiol, Austin, TX USA
Keywords
Bayesian inference; binary outcomes; deterministic annealing; expectation-maximization; grouped covariates; heredity constraint; inheritance property; variable selection; 62F15; 62J12; 68U20; GROUP LASSO; YOUTH; SMOKING; BINARY;
DOI
10.1080/00949655.2017.1398255
CLC number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
We develop a Bayesian variable selection method for logistic regression models that can simultaneously accommodate qualitative covariates and interaction terms under various heredity constraints. We use expectation-maximization variable selection (EMVS) with a deterministic annealing variant as the platform for our method, due to its proven flexibility and efficiency. We propose a variance adjustment of the priors for the coefficients of qualitative covariates, which controls false-positive rates, and a flexible parameterization for interaction terms, which accommodates user-specified heredity constraints. This method can handle all pairwise interaction terms as well as a subset of specific interactions. Using simulation, we show that this method selects associated covariates better than the grouped LASSO and the LASSO with heredity constraints in various exploratory research scenarios encountered in epidemiological studies. We apply our method to identify genetic and non-genetic risk factors associated with smoking experimentation in a cohort of Mexican-heritage adolescents.
Pages: 575-596
Number of pages: 22
Related papers
50 in total
  • [41] A novel Bayesian approach for variable selection in linear regression models
    Posch, Konstantin
    Arbeiter, Maximilian
    Pilz, Juergen
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 144
  • [42] Variable selection for sparse logistic regression
    Zanhua Yin
    Metrika, 2020, 83 : 821 - 836
  • [43] Variable Selection in Logistic Regression Model
    Zhang Shangli
    Zhang Lili
    Qiu Kuanmin
    Lu Ying
    Cai Baigen
    CHINESE JOURNAL OF ELECTRONICS, 2015, 24 (04) : 813 - 817
  • [44] Variable Selection in Logistic Regression Model
    ZHANG Shangli
    ZHANG Lili
    QIU Kuanmin
    LU Ying
    CAI Baigen
    Chinese Journal of Electronics, 2015, 24 (04) : 813 - 817
  • [45] Variable selection for sparse logistic regression
    Yin, Zanhua
    METRIKA, 2020, 83 (07) : 821 - 836
  • [46] A variable selection method based on Tabu search for logistic regression models
    Pacheco, Joaquin
    Casado, Silvia
    Nunez, Laura
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2009, 199 (02) : 506 - 511
  • [47] Robust Variable and Interaction Selection for Logistic Regression and General Index Models
    Li, Yang
    Liu, Jun S.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2019, 114 (525) : 271 - 286
  • [48] Variable Selection Using Bayesian Additive Regression Trees
    Luo, Chuji
    Daniels, Michael J.
    STATISTICAL SCIENCE, 2024, 39 (02) : 286 - 304
  • [49] Boosting Variable Selection Algorithm for Linear Regression Models
    Zhang, Chun-Xia
    Wang, Guan-Wei
    2014 10TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION (ICNC), 2014, : 769 - 774
  • [50] Cancer classification and prediction using logistic regression with Bayesian gene selection
    Zhou, XB
    Liu, KY
    Wong, STC
    JOURNAL OF BIOMEDICAL INFORMATICS, 2004, 37 (04) : 249 - 259