FLAME: A fast large-scale almost matching exactly approach to causal inference

Cited by: 0
Authors
Wang, Tianyu [1]
Morucci, Marco [1]
Awan, M. Usaid [1]
Liu, Yameng [1]
Roy, Sudeepa [1]
Rudin, Cynthia [1]
Volfovsky, Alexander [1]
Affiliations
[1] Duke University, United States
Keywords
Categorical datasets - Causal inferences - Classical problems - Conditional average - Database management - Distance metrics - State-of-the-art methods - Training data sets
DOI
Not available
Abstract
A classical problem in causal inference is that of matching, where treatment units need to be matched to control units based on covariate information. In this work, we propose a method that computes high quality almost-exact matches for high-dimensional categorical datasets. This method, called FLAME (Fast Large-scale Almost Matching Exactly), learns a distance metric for matching using a hold-out training data set. In order to perform matching efficiently for large datasets, FLAME leverages techniques that are natural for query processing in the area of database management, and two implementations of FLAME are provided: The first uses SQL queries and the second uses bit-vector techniques. The algorithm starts by constructing matches of the highest quality (exact matches on all covariates), and successively eliminates variables in order to match exactly on as many variables as possible, while still maintaining interpretable high-quality matches and balance between treatment and control groups. We leverage these high quality matches to estimate conditional average treatment effects (CATEs). Our experiments show that FLAME scales to huge datasets with millions of observations where existing state-of-the-art methods fail, and that it achieves significantly better performance than other matching methods. © 2021 Tianyu Wang, Marco Morucci, M. Usaid Awan, Yameng Liu, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky. © 2021 Microtome Publishing. All rights reserved.
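The matching loop described in the abstract (match exactly on all covariates first, then drop covariates one at a time and match again) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the function name `flame_sketch`, the input layout, and the caller-supplied `drop_order` are all assumptions made here. In particular, FLAME learns which covariate to eliminate from a hold-out training set, whereas this sketch takes the elimination order as given, and it uses plain Python dictionaries rather than SQL queries or bit vectors.

```python
from collections import defaultdict


def flame_sketch(units, covariates, drop_order):
    """Simplified almost-exact matching with a fixed covariate drop order.

    units: list of dicts holding the covariate values plus
           'treated' (0 or 1) and 'outcome' (hypothetical layout).
    covariates: covariate names to match on initially.
    drop_order: covariates to eliminate, in order (an assumption; the
                paper learns this order from hold-out data).
    Returns (groups, unmatched), where each group is a tuple
    (covariates_used, values, member_indices, cate_estimate).
    """
    remaining = set(range(len(units)))
    active = list(covariates)
    to_drop = list(drop_order)
    groups = []

    while remaining and active:
        # Bucket the still-unmatched units by their values on the active covariates.
        buckets = defaultdict(list)
        for i in sorted(remaining):
            buckets[tuple(units[i][c] for c in active)].append(i)

        # A bucket is a valid matched group only if it holds both
        # treated and control units.
        for values, members in buckets.items():
            treated = [units[i]["outcome"] for i in members if units[i]["treated"] == 1]
            control = [units[i]["outcome"] for i in members if units[i]["treated"] == 0]
            if treated and control:
                # Difference in mean outcomes within the group as a CATE estimate.
                cate = sum(treated) / len(treated) - sum(control) / len(control)
                groups.append((tuple(active), values, members, cate))
                remaining -= set(members)

        if not to_drop:
            break
        # Eliminate the next covariate and try to match the leftover units.
        active.remove(to_drop.pop(0))

    return groups, remaining


# Toy data: units 0 and 1 match exactly; units 2 and 3 match only once x2 is dropped.
example_units = [
    {"x1": 0, "x2": 0, "treated": 1, "outcome": 3.0},
    {"x1": 0, "x2": 0, "treated": 0, "outcome": 1.0},
    {"x1": 1, "x2": 0, "treated": 1, "outcome": 5.0},
    {"x1": 1, "x2": 1, "treated": 0, "outcome": 2.0},
]
example_groups, example_unmatched = flame_sketch(example_units, ["x1", "x2"], ["x2"])
```

Units matched in an earlier round are removed before the next covariate is dropped, so each unit ends up in the group that agrees with it on the largest set of covariates under the given order.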
Related papers
50 in total
  • [31] Fast sparsity adaptive matching pursuit algorithm for large-scale image reconstruction
    Shihong Yao
    Qingfeng Guan
    Sheng Wang
    Xiao Xie
    EURASIP Journal on Wireless Communications and Networking, 2018
  • [32] A Fast Approach of Large-Scale IP Traffic Matrix Estimation
    Jiang, Dingde
    Chen, Jun
    He, Linbo
    Hu, Guangmin
    2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007, : 1913 - +
  • [33] FRANK: A Fast Node Ranking Approach in Large-Scale Networks
    Zhang, Yu
    Gu, Lin
    Liao, Xiaofei
    Jin, Hai
    Zeng, Deze
    Zhou, Bing Bing
    IEEE NETWORK, 2017, 31 (01): : 36 - 43
  • [34] Causal inference and large-scale expert validation shed light on the drivers of SDM accuracy and variance
    Boyd, Robin J.
    Harvey, Martin
    Roy, David B.
    Barber, Tony
    Haysom, Karen A.
    Macadam, Craig R.
    Morris, Roger K. A.
    Palmer, Carolyn
    Palmer, Stephen
    Preston, Chris D.
    Taylor, Pam
    Ward, Robert
    Ball, Stuart G.
    Pescott, Oliver L.
    DIVERSITY AND DISTRIBUTIONS, 2023, 29 (06) : 774 - 784
  • [35] Efficient Large-Scale Stereo Matching
    Geiger, Andreas
    Roser, Martin
    Urtasun, Raquel
    COMPUTER VISION-ACCV 2010, PT I, 2011, 6492 : 25 - +
  • [36] Large-Scale Collective Entity Matching
    Rastogi, Vibhor
    Dalvi, Nilesh
    Garofalakis, Minos
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2011, 4 (04): : 208 - 218
  • [37] Scalable Algorithms for Bayesian Inference of Large-Scale Models from Large-Scale Data
    Ghattas, Omar
    Isaac, Tobin
    Petra, Noemi
    Stadler, Georg
    HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2016, 2017, 10150 : 3 - 6
  • [38] An efficient algorithm for large-scale causal discovery
    Hong, Yinghan
    Liu, Zhusong
    Mai, Guizhen
    SOFT COMPUTING, 2017, 21 (24) : 7381 - 7391
  • [39] An efficient algorithm for large-scale causal discovery
    Yinghan Hong
    Zhusong Liu
    Guizhen Mai
    Soft Computing, 2017, 21 : 7381 - 7391
  • [40] Large-scale discretization and generalized mode-matching as a basis for fast electromagnetic solvers
    Kirilenko, A
    MATHEMATICAL METHODS IN ELECTROMAGNETIC THEORY, CONFERENCE PROCEEDINGS, VOLS 1 AND 2, 2002, : 99 - 99