Ultra-high dimensional variable selection for doubly robust causal inference

Cited by: 9
Authors
Tang, Dingke [1]
Kong, Dehan [1]
Pan, Wenliang [2]
Wang, Linbo [1]
Affiliations
[1] Univ Toronto, Dept Stat Sci, Toronto, ON M5S 3G3, Canada
[2] Sun Yat Sen Univ, Sch Math, Dept Stat Sci, Guangzhou, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Alzheimer's disease; average causal effect; ball covariance; confounder selection; variable screening; PROPENSITY SCORE; ALZHEIMERS-DISEASE; MODEL SELECTION; ADAPTIVE LASSO; EFFICIENT; TAU; BIOMARKERS;
DOI
10.1111/biom.13625
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Causal inference has been increasingly reliant on observational studies with rich covariate information. To build tractable causal procedures, such as the doubly robust estimators, it is imperative to first extract important features from high or even ultra-high dimensional data. In this paper, we propose causal ball screening for confounder selection from modern ultra-high dimensional data sets. Unlike the familiar task of variable selection for prediction modeling, our confounder selection procedure aims to control for confounding while improving efficiency in the resulting causal effect estimate. Previous empirical and theoretical studies suggest excluding causes of the treatment that are not confounders. Motivated by these results, our goal is to keep all the predictors of the outcome in both the propensity score and outcome regression models. A distinctive feature of our proposal is that we use an outcome model-free procedure for propensity score model selection, thereby maintaining double robustness in the resulting causal effect estimator. Our theoretical analyses show that the proposed procedure enjoys a number of properties, including model selection consistency and pointwise normality. Synthetic and real data analyses show that our proposal compares favorably with existing methods in a range of realistic settings. Data used in preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.
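The double robustness mentioned in the abstract can be illustrated with a minimal sketch of the standard augmented inverse probability weighting (AIPW) estimator of the average causal effect. This is not the paper's causal ball screening procedure; the function names, the simulated data-generating process, and the true effect of 2 are all assumptions chosen for illustration. The sketch shows that the estimate stays close to the truth when either the outcome regression or the propensity score is correct, even if the other is misspecified.

```python
import math
import random


def aipw_ate(y, a, ps, mu1, mu0):
    """AIPW estimate of E[Y(1)] - E[Y(0)].

    y: outcomes, a: binary treatments (0/1), ps: propensity scores P(A=1|X),
    mu1 / mu0: outcome-regression predictions under treatment / control.
    """
    total = 0.0
    for yi, ai, pi, m1, m0 in zip(y, a, ps, mu1, mu0):
        treated = m1 + ai * (yi - m1) / pi
        control = m0 + (1 - ai) * (yi - m0) / (1.0 - pi)
        total += treated - control
    return total / len(y)


def simulate(n=20000, seed=0):
    """Hypothetical data: one confounder x, true average causal effect = 2."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ps = [1.0 / (1.0 + math.exp(-xi)) for xi in x]  # true propensity score
    a = [1 if rng.random() < p else 0 for p in ps]
    y = [2.0 * ai + xi + rng.gauss(0.0, 1.0) for ai, xi in zip(a, x)]
    return x, ps, a, y


x, ps_true, a, y = simulate()
n = len(y)

# Correct outcome model, deliberately wrong (constant) propensity score.
est_outcome_ok = aipw_ate(y, a, [0.5] * n, [2.0 + xi for xi in x], x)

# Correct propensity score, deliberately wrong (all-zero) outcome model.
est_ps_ok = aipw_ate(y, a, ps_true, [0.0] * n, [0.0] * n)

print(est_outcome_ok, est_ps_ok)
```

In practice both nuisance models are estimated from data, which is where confounder selection in ultra-high dimensions becomes necessary; here the correct models are plugged in directly to keep the sketch short.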
Pages: 903-914
Number of pages: 12
Related papers
50 in total
  • [21] Ultra-high dimensional variable selection with application to normative aging study: DNA methylation and metabolic syndrome
    Yoon, Grace
    Zheng, Yinan
    Zhang, Zhou
    Zhang, Haixiang
    Gao, Tao
    Joyce, Brian
    Zhang, Wei
    Guan, Weihua
    Baccarelli, Andrea A.
    Jiang, Wenxin
    Schwartz, Joel
    Vokonas, Pantel S.
    Hou, Lifang
    Liu, Lei
    BMC BIOINFORMATICS, 2017, 18
  • [22] Doubly Robust Augmented Model Accuracy Transfer Inference with High Dimensional Features
    Zhou, Doudou
    Liu, Molei
    Li, Mengyan
    Cai, Tianxi
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024,
  • [23] Causal Agnosticism About Race: Variable Selection Problems in Causal Inference
    Tolbert, Alexander Williams
    PHILOSOPHY OF SCIENCE, 2024, 91 (05) : 1098 - 1108
  • [24] Practically effective adjustment variable selection in causal inference
    Noda, Atsushi
    Isozaki, Takashi
    JOURNAL OF PHYSICS-COMPLEXITY, 2025, 6 (01):
  • [25] Causal inference for unbalanced cases in incomplete data with doubly robust estimators
    Yu, Haiyan
    Xiang, Jiao
    Gao, Mingyue
    Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice, 2022, 42 (01) : 211 - 223
  • [26] Grouped variable screening for ultra-high dimensional data for linear model
    Qiu, Debin
    Ahn, Jeongyoun
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 144
  • [27] Ultra-high dimensional variable screening via Gram–Schmidt orthogonalization
    Huiwen Wang
    Ruiping Liu
    Shanshan Wang
    Zhichao Wang
    Gilbert Saporta
    Computational Statistics, 2020, 35 : 1153 - 1170
  • [28] Variable selection for ultra-high-dimensional logistic models
    Du, Pang
    Wu, Pan
    Liang, Hua
    PERSPECTIVES ON BIG DATA ANALYSIS: METHODOLOGIES AND APPLICATIONS, 2014, 622 : 141 - 158
  • [29] Doubly robust estimation and causal inference in longitudinal studies with dropout and truncation by death
    Shardell, Michelle
    Hicks, Gregory E.
    Ferrucci, Luigi
    BIOSTATISTICS, 2015, 16 (01) : 155 - 168
  • [30] Model misspecification and robustness in causal inference: comparing matching with doubly robust estimation
    Waernbaum, Ingeborg
    STATISTICS IN MEDICINE, 2012, 31 (15) : 1572 - 1581