Causal Discovery in the Presence of Missing Data

Cited by: 0
Authors
Tu, Ruibo [1]
Zhang, Cheng [2]
Ackermann, Paul [3]
Mohan, Karthika [4]
Kjellstrom, Hedvig [1]
Zhang, Kun [5]
Affiliations
[1] KTH Royal Inst Technol, Stockholm, Sweden
[2] Microsoft Res, Cambridge, England
[3] Karolinska Inst, Solna, Sweden
[4] Univ Calif Berkeley, Berkeley, CA 94720 USA
[5] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Source
22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89 | 2019 / Vol. 89
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
INFERENCE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Missing data are ubiquitous in many domains such as healthcare. When these data entries are not missing completely at random, the (conditional) independence relations in the observed data may be different from those in the complete data generated by the underlying causal process. Consequently, simply applying existing causal discovery methods to the observed data may lead to wrong conclusions. In this paper, we aim at developing a causal discovery method to recover the underlying causal structure from observed data that are missing under different mechanisms, including missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). With missingness mechanisms represented by missingness graphs (m-graphs), we analyze conditions under which additional correction is needed to derive conditional independence/dependence relations in the complete data. Based on our analysis, we propose Missing Value PC (MVPC), which extends the PC algorithm to incorporate additional corrections. Our proposed MVPC is shown in theory to give asymptotically correct results even on data that are MAR or MNAR. Experimental results on both synthetic data and real healthcare applications illustrate that the proposed algorithm is able to find correct causal relations even in the general case of MNAR.
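The abstract's first claim, that independence relations in the observed data can differ from those in the complete data, can be illustrated with a toy simulation. This is a minimal sketch and not the authors' MVPC algorithm: the variable names, the 0.5 threshold, and the use of plain Pearson correlation are all assumptions made for illustration. X and Y are generated independently, and a third variable Z is missing whenever X + Y is large (a MAR mechanism, since the cause of missingness is fully observed). Dropping every row in which Z is missing then conditions on the missingness indicator and induces a dependence between X and Y.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=20000, seed=0):
    """Draw independent X, Y, Z; hide Z whenever X + Y > 0.5 (hypothetical mechanism)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        z_obs = None if x + y > 0.5 else z  # None marks a missing entry
        rows.append((x, y, z_obs))
    return rows


rows = simulate()

# Correlation of X and Y in the complete data: close to zero by construction.
full_corr = pearson([r[0] for r in rows], [r[1] for r in rows])

# Correlation after deleting rows with missing Z: selection on X + Y
# makes X and Y negatively correlated among the remaining rows.
complete = [r for r in rows if r[2] is not None]
deleted_corr = pearson([r[0] for r in complete], [r[1] for r in complete])

print(f"complete data:  corr(X, Y) = {full_corr:.3f}")
print(f"after deletion: corr(X, Y) = {deleted_corr:.3f}")
```

A constraint-based method that tested X and Y on the deleted rows alone would tend to report a dependence that is absent from the underlying process, which is the kind of error the abstract says needs correction.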
Pages: 9
Related papers
50 in total
  • [31] Haplotype analysis in the presence of missing data
    Liu, N
    Zhao, H
    GENETIC EPIDEMIOLOGY, 2005, 29 (03) : 265 - 265
  • [32] Response shift in the presence of missing data
    Fairclough, D. L.
    Quality of Life Research, 2015, 24 : 565 - 566
  • [33] Assessing Fairness in the Presence of Missing Data
    Zhang, Yiliang
    Long, Qi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [34] Response shift in the presence of missing data
    Fairclough, D. L.
    QUALITY OF LIFE RESEARCH, 2015, 24 (03) : 565 - 566
  • [35] ARCH MODELING IN THE PRESENCE OF MISSING DATA
    Bondon, Pascal
    2013 ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2013, : 39 - 43
  • [36] Prospective prediction in the presence of missing data
    Marshall, G
    Warner, B
    MaWhinney, S
    Hammermeister, K
    STATISTICS IN MEDICINE, 2002, 21 (04) : 561 - 570
  • [37] Doubly robust estimation of the causal effects in the causal inference with missing outcome data
    Han F.
    Journal of Ambient Intelligence and Humanized Computing, 2024, 15 (Suppl 1) : 11 - 11
  • [38] CDRM: Causal disentangled representation learning for missing data
    Chen, Mingjie
    Wang, Hongcheng
    Wang, Ruxin
    Peng, Yuzhong
    Zhang, Hao
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [39] Missing data estimation in fMRI dynamic causal modeling
    Zaghlool, Shaza B.
    Wyatt, Christopher L.
    FRONTIERS IN NEUROSCIENCE, 2014, 8
  • [40] Identifiability and estimation of causal mediation effects with missing data
    Li, Wei
    Zhou, Xiao-Hua
    STATISTICS IN MEDICINE, 2017, 36 (25) : 3948 - 3965