Causal Discovery in the Presence of Missing Data

Times cited: 0
Authors
Tu, Ruibo [1]
Zhang, Cheng [2]
Ackermann, Paul [3]
Mohan, Karthika [4]
Kjellstrom, Hedvig [1]
Zhang, Kun [5]
Affiliations
[1] KTH Royal Inst Technol, Stockholm, Sweden
[2] Microsoft Res, Cambridge, England
[3] Karolinska Inst, Solna, Sweden
[4] Univ Calif Berkeley, Berkeley, CA 94720 USA
[5] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
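Source
22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89 | 2019 / Vol. 89
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
INFERENCE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract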
Missing data are ubiquitous in many domains such as healthcare. When these data entries are not missing completely at random, the (conditional) independence relations in the observed data may be different from those in the complete data generated by the underlying causal process. Consequently, simply applying existing causal discovery methods to the observed data may lead to wrong conclusions. In this paper, we aim at developing a causal discovery method to recover the underlying causal structure from observed data that are missing under different mechanisms, including missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). With missingness mechanisms represented by missingness graphs (m-graphs), we analyze conditions under which additional correction is needed to derive conditional independence/dependence relations in the complete data. Based on our analysis, we propose Missing Value PC (MVPC), which extends the PC algorithm to incorporate additional corrections. Our proposed MVPC is shown in theory to give asymptotically correct results even on data that are MAR or MNAR. Experimental results on both synthetic data and real healthcare applications illustrate that the proposed algorithm is able to find correct causal relations even in the general case of MNAR.
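The abstract's central observation, that independence relations in the observed data can differ from those in the complete data, can be illustrated with a small simulation. The sketch below is not MVPC and is not taken from the paper; the variable names, the missingness rule, and the noise scale are all assumptions chosen for illustration. Two variables x and y are generated independently, and a third variable w goes missing with a probability that depends on both x and y. Dropping the incomplete rows then conditions on a common effect of x and y, which induces a correlation that is absent in the complete data.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Draw independent x, y and a third variable w with induced missingness.

    Hypothetical mechanism (an assumption for this sketch): w is missing
    whenever x + y + noise > 0, so its missingness depends on both x and y.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = rng.normal(size=n)  # generated independently of x
    w = rng.normal(size=n)
    missing = (x + y + rng.normal(scale=0.5, size=n)) > 0
    w_obs = np.where(missing, np.nan, w)  # NaN marks a missing entry
    return x, y, w_obs


def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


x, y, w_obs = simulate()

# Correlation of x and y using every record (the complete-data relation).
full_corr = corr(x, y)

# Correlation after deleting every record in which w is missing.
keep = ~np.isnan(w_obs)
deleted_corr = corr(x[keep], y[keep])

print(f"all records:      corr(x, y) = {full_corr:+.3f}")
print(f"after deletion:   corr(x, y) = {deleted_corr:+.3f}")
```

Under these assumptions the first correlation stays close to zero while the second is clearly negative, even though x and y were generated independently. A constraint-based method that tests independence only on the retained rows could therefore keep an edge between x and y that the complete data do not support.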
Pages: 9