Random Forests Approach for Causal Inference with Clustered Observational Data

Cited by: 10
Authors
Suk, Youmi [1]
Kang, Hyunseung [2]
Kim, Jee-Seon [1]
Affiliations
[1] Univ Wisconsin Madison, Dept Educ Psychol, Madison, WI 53706 USA
[2] Univ Wisconsin Madison, Dept Stat, Madison, WI USA
Keywords
causal inference; machine learning methods; multilevel propensity score matching; multilevel observational data; hierarchical linear modeling; propensity score estimation; selection bias; stratification; regression; impact
DOI
10.1080/00273171.2020.1808437
Chinese Library Classification
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
There is growing interest in using machine learning (ML) methods for causal inference due to their (nearly) automatic and flexible ability to model key quantities such as the propensity score or the outcome model. Unfortunately, most ML methods for causal inference have been studied under single-level settings where all individuals are independent of each other, and there is little work on using these methods with clustered or nested data, a common setting in education studies. This paper investigates one particular ML method based on random forests, known as Causal Forests, for estimating treatment effects in multilevel observational data. We conduct simulation studies under different types of multilevel data, including two-level, three-level, and cross-classified data. Our simulation studies show that when the ML method is supplemented with estimated propensity scores from multilevel models that account for clustered/hierarchical structure, the modified ML method outperforms preexisting methods in a wide variety of settings. We conclude by estimating the effect of private math lessons in the Trends in International Mathematics and Science Study data, a large-scale educational assessment where students are nested within schools.
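The reason cluster-aware propensity scores matter can be illustrated with a small simulation. The sketch below is not the paper's method: it swaps in per-cluster treated proportions for the multilevel propensity model and plain inverse probability weighting for Causal Forests, and every name and parameter value (`simulate`, `TRUE_EFFECT`, the cluster counts, the coefficients) is a hypothetical choice made for illustration. It only shows that ignoring a cluster-level confounder biases a naive comparison, while a propensity score that varies by cluster removes most of that bias.

```python
import math
import random

TRUE_EFFECT = 1.0  # hypothetical constant treatment effect


def simulate(num_clusters=60, cluster_size=100, seed=0):
    """Simulate two-level data where a cluster effect drives treatment and outcome."""
    rng = random.Random(seed)
    rows = []  # each row is (cluster id, treatment, outcome)
    for j in range(num_clusters):
        u = rng.gauss(0.0, 1.0)  # unobserved cluster-level confounder
        p = 1.0 / (1.0 + math.exp(-0.8 * u))  # treatment probability in cluster j
        for _ in range(cluster_size):
            t = 1 if rng.random() < p else 0
            y = TRUE_EFFECT * t + 2.0 * u + rng.gauss(0.0, 1.0)
            rows.append((j, t, y))
    return rows


def naive_difference(rows):
    """Difference in mean outcomes that ignores the clustering entirely."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def cluster_propensity(rows):
    """Smoothed treated proportion per cluster (stand-in for a multilevel model)."""
    counts = {}
    for j, t, _ in rows:
        n, k = counts.get(j, (0, 0))
        counts[j] = (n + 1, k + t)
    # Add-one smoothing keeps every score strictly between 0 and 1.
    return {j: (k + 1) / (n + 2) for j, (n, k) in counts.items()}


def ipw_effect(rows, propensity):
    """Normalized (Hajek) inverse-probability-weighted effect estimate."""
    num1 = den1 = num0 = den0 = 0.0
    for j, t, y in rows:
        e = propensity[j]
        if t == 1:
            num1 += y / e
            den1 += 1.0 / e
        else:
            num0 += y / (1.0 - e)
            den0 += 1.0 / (1.0 - e)
    return num1 / den1 - num0 / den0


rows = simulate()
propensity = cluster_propensity(rows)
naive = naive_difference(rows)
adjusted = ipw_effect(rows, propensity)
print(f"naive estimate:         {naive:.3f}")
print(f"cluster-aware estimate: {adjusted:.3f}")
```

In this toy setup the confounder exists only at the cluster level, so a per-cluster propensity score is enough. With individual-level covariates, a model that combines both levels would be needed, which is the role the abstract assigns to multilevel propensity score models.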
Pages: 829 - 852
Page count: 24