Combining Multiple Observational Data Sources to Estimate Causal Effects

Cited by: 39
Authors
Yang, Shu [1]
Ding, Peng [2]
Affiliations
[1] North Carolina State Univ, Dept Stat, 2311 Stinson Dr Campus Box 8203, Raleigh, NC 27695 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
Keywords
Calibration; Causal inference; Inverse probability weighting; Missing confounder; Two-phase sampling; PROPENSITY SCORE CALIBRATION; DOUBLY ROBUST ESTIMATION; LARGE-SAMPLE PROPERTIES; AUXILIARY INFORMATION; MISSING CONFOUNDERS; MATCHING ESTIMATORS; VALIDATION DATA; REGRESSION; INFERENCE; 2-PHASE
DOI
10.1080/01621459.2019.1609973
Chinese Library Classification number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
The era of big data has witnessed an increasing availability of multiple data sources for statistical analyses. We consider estimation of causal effects combining big main data with unmeasured confounders and smaller validation data with supplementary information on these confounders. Under the unconfoundedness assumption with completely observed confounders, the smaller validation data allow for constructing consistent estimators for causal effects, but the big main data can only give error-prone estimators in general. However, by leveraging the information in the big main data in a principled way, we can improve the estimation efficiencies yet preserve the consistencies of the initial estimators based solely on the validation data. Our framework applies to asymptotically normal estimators, including the commonly used regression imputation, weighting, and matching estimators, and does not require a correct specification of the model relating the unmeasured confounders to the observed variables. We also propose appropriate bootstrap procedures, which makes our method straightforward to implement using software routines for existing estimators. Supplementary materials for this article are available online.
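The abstract does not spell out how the main data improve the validation-only estimator. One plausible reading is a control-variate style adjustment: compute the same error-prone estimator on both the validation data and the main data, and use their difference (which should shrink toward zero as both samples grow) to reduce the variance of the consistent validation estimator. The sketch below illustrates that idea for a scalar effect. The function names, the argument names, and the use of bootstrap replicates to estimate the covariance terms are assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean


def sample_cov(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def combine_estimates(tau_val, ep_val, ep_main, tau_reps, diff_reps):
    """Adjust a consistent validation estimate using an error-prone one.

    tau_val   : consistent effect estimate from the validation data
    ep_val    : error-prone estimate computed on the validation data
    ep_main   : the same error-prone estimate computed on the main data
    tau_reps  : bootstrap replicates of tau_val (hypothetical input)
    diff_reps : bootstrap replicates of (ep_val - ep_main), aligned
                replicate-by-replicate with tau_reps (hypothetical input)
    """
    gamma = sample_cov(tau_reps, diff_reps)   # cov(tau, difference)
    v = sample_cov(diff_reps, diff_reps)      # var(difference)
    # Subtract the part of tau_val's noise that the difference explains.
    return tau_val - (gamma / v) * (ep_val - ep_main)


# Toy numbers, chosen only to show the arithmetic.
adjusted = combine_estimates(
    tau_val=2.0, ep_val=1.2, ep_main=1.0,
    tau_reps=[1.0, 2.0, 3.0], diff_reps=[0.5, 1.0, 1.5],
)
```

If the replicates of the difference are uncorrelated with the replicates of the validation estimate, the estimated covariance is zero and the function returns the validation estimate unchanged, so under this reading the adjustment can only help when the two are correlated.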
Pages: 1540-1554
Number of pages: 15