Variable selection for multiply-imputed data with penalized generalized estimating equations

Cited by: 6
Authors
Geronimi, J. [1, 2]
Saporta, G. [2]
Affiliations
[1] IRIS, 50 Rue Carnot, F-92284 Suresnes, France
[2] CNAM, Cedric, 292 Rue St Martin, F-75141 Paris, France
Keywords
Generalized estimating equations; LASSO; Longitudinal data; Missing data; Multiple imputation; Variable selection; Imputation; Regression; Knee
DOI
10.1016/j.csda.2017.01.001
Chinese Library Classification
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
Generalized estimating equations (GEE) are useful tools for marginal regression analysis of longitudinal data. A large number of variables combined with missing data raises complex issues in a longitudinal context. For variable selection in particular, penalized generalized estimating equations have not been systematically developed to handle missing data. Multiple imputation penalized generalized estimating equations (MI-PGEE), an extension of the multiple imputation least absolute shrinkage and selection operator (MI-LASSO), are presented. MI-PGEE accounts for both missing data and within-subject correlation in variable selection procedures. Missing data are handled with multiple imputation, and variable selection is performed with a group LASSO penalty. The estimated coefficients for the same variable across the multiply-imputed datasets are treated as a group when applying penalized generalized estimating equations, which leads to a single model across all multiply-imputed datasets. To select the tuning parameter, a new BIC-like criterion is proposed. A simulation study shows the advantage of MI-PGEE over simple imputation PGEE. The usefulness of the new method is illustrated by an application to a subgroup of the placebo arm of the strontium ranelate efficacy in knee osteoarthritis trial study. © 2017 Elsevier B.V. All rights reserved.
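The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only a generic group LASSO penalty and its standard proximal (group soft-thresholding) step, assuming the coefficients are stored in a hypothetical array `beta` of shape (D, p), with one row per imputed dataset and one column per variable. The function names, the value of `lam`, and the numbers are illustrative assumptions; the GEE fitting step and the BIC-like criterion are not reproduced here.

```python
import numpy as np

def group_lasso_penalty(beta, lam):
    """Group LASSO penalty for beta of shape (D, p).

    Each column (one variable across the D imputed datasets) is one group,
    penalized through its Euclidean norm. Illustrative assumption only.
    """
    return lam * np.sqrt((beta ** 2).sum(axis=0)).sum()

def group_soft_threshold(beta, lam):
    """Standard proximal step for the group penalty, applied per column."""
    norms = np.sqrt((beta ** 2).sum(axis=0))
    # Columns whose norm is at most lam are set to zero in every row at once;
    # the remaining columns are shrunk toward zero by a common factor.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return beta * scale

# Hypothetical coefficients from D = 3 imputed datasets and p = 3 variables.
beta = np.array([[0.9,  0.05, -0.4],
                 [1.1, -0.02, -0.5],
                 [1.0,  0.03, -0.3]])

shrunk = group_soft_threshold(beta, lam=0.2)
# A variable is kept if its group survived; the decision is shared by all rows.
selected = np.any(shrunk != 0.0, axis=0)
print(selected)
```

Because the threshold acts on a whole column, a variable is either kept in every imputed dataset or dropped from all of them, which is how grouping yields one selected model rather than one per imputation.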
Pages: 103-114
Number of pages: 12