Data-Driven Covariate Selection for Confounding Adjustment by Focusing on the Stability of the Effect Estimator

Cited by: 4
Authors
Loh, Wen Wei [1, 2, 4]
Ren, Dongning [3 ]
Affiliations
[1] Emory Univ, Dept Quantitat Theory & Methods, Atlanta, GA USA
[2] Univ Ghent, Dept Data Anal, Ghent, Belgium
[3] Tilburg Univ, Dept Social Psychol, Tilburg, Netherlands
[4] Emory Univ, Dept Quantitat Theory & Methods, 36 Eagle Row, Atlanta, GA 30322 USA
Keywords
causal inference; double selection; observational studies; propensity scores; strong ignorability; DOUBLY ROBUST ESTIMATION; PROPENSITY SCORE; CAUSAL INFERENCE; VARIABLE SELECTION; SENSITIVITY-ANALYSIS; MODEL-SELECTION; BIAS REDUCTION; REGRESSION; LASSO; MISSPECIFICATION
DOI
10.1037/met0000564
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Valid inference of cause-and-effect relations in observational studies necessitates adjusting for common causes of the focal predictor (i.e., treatment) and the outcome. When such common causes, henceforth termed confounders, remain unadjusted for, they generate spurious correlations that lead to biased causal effect estimates. But routine adjustment for all available covariates, when only a subset are truly confounders, is known to yield potentially inefficient and unstable estimators. In this article, we introduce a data-driven confounder selection strategy that focuses on stable estimation of the treatment effect. The approach exploits the causal knowledge that after adjusting for confounders to eliminate all confounding biases, adding any remaining non-confounding covariates associated with only treatment or outcome, but not both, should not systematically change the effect estimator. The strategy proceeds in two steps. First, we prioritize covariates for adjustment by probing how strongly each covariate is associated with treatment and outcome. Next, we gauge the stability of the effect estimator by evaluating its trajectory adjusting for different covariate subsets. The smallest subset that yields a stable effect estimate is then selected. Thus, the strategy offers direct insight into the (in)sensitivity of the effect estimator to the chosen covariates for adjustment. The ability to correctly select confounders and yield valid causal inferences following data-driven covariate selection is evaluated empirically using extensive simulation studies. Furthermore, we compare the introduced method empirically with routine variable selection methods. Finally, we demonstrate the procedure using two publicly available real-world datasets. A step-by-step practical guide with user-friendly R functions is included.
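The two-step strategy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R implementation: the ranking score (the smaller of a covariate's absolute correlations with treatment and with outcome), the stability rule (all later estimates within a tolerance `tol` of the current one), the simulated data, and every function name are assumptions made only for illustration.

```python
import numpy as np

def ols_effect(y, a, X):
    """Coefficient of treatment `a` in an OLS fit of y on [1, a, X]."""
    design = np.column_stack([np.ones(len(y)), a, X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

def prioritize(y, a, X):
    """Assumed score: min(|corr with treatment|, |corr with outcome|), ranked descending."""
    score = [
        min(abs(np.corrcoef(X[:, j], a)[0, 1]), abs(np.corrcoef(X[:, j], y)[0, 1]))
        for j in range(X.shape[1])
    ]
    return np.argsort(score)[::-1]

def trajectory(y, a, X, order):
    """Effect estimates when adjusting for the first k prioritized covariates, k = 0..p."""
    return np.array([ols_effect(y, a, X[:, order[:k]]) for k in range(len(order) + 1)])

def smallest_stable_subset(traj, tol):
    """Assumed rule: smallest k whose estimate stays within tol of every later estimate."""
    for k in range(len(traj)):
        if np.all(np.abs(traj[k:] - traj[k]) <= tol):
            return k
    return len(traj) - 1

# Hypothetical simulated data: columns 0-1 are confounders, 2-3 affect only
# treatment, 4-5 affect only the outcome. The true treatment effect is 1.0.
rng = np.random.default_rng(0)
n = 20_000
X = rng.normal(size=(n, 6))
a = (0.8 * X[:, :4].sum(axis=1) + rng.normal(size=n) > 0).astype(float)
y = 1.0 * a + X[:, 0] + X[:, 1] + X[:, 4] + X[:, 5] + rng.normal(size=n)

order = prioritize(y, a, X)
traj = trajectory(y, a, X, order)
k = smallest_stable_subset(traj, tol=0.1)

print("priority order:", order)
print("effect trajectory:", np.round(traj, 3))
print("selected covariates:", sorted(order[:k].tolist()), "estimate:", round(float(traj[k]), 3))
```

In this simulated setting, the estimate is expected to shift markedly until both confounders are adjusted for and then to remain roughly flat as treatment-only or outcome-only covariates are added, which is the behaviour the abstract uses to justify picking the smallest stable subset.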
Pages: 947-966
Number of pages: 20