Causal discovery based on heterogeneous non-Euclidean data

Authors
Wang X. [1]
Li S. [1]
Wang Y. [2]
Wan Y. [1]
Affiliations
[1] School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing
[2] School of Banking and Finance, University of International Business and Economics, Beijing
Funding
National Natural Science Foundation of China
Keywords
canonical correlation analysis; causal discovery; compositional data; functional data; industrial fault diagnosis
DOI
10.12011/SETP2023-1202
Abstract
Causal relationships play an irreplaceable role in revealing the mechanisms of phenomena and guiding intervention actions. However, due to limitations in existing frameworks regarding model representations and learning algorithms, only a few studies have explored causal discovery on non-Euclidean data. In this paper, we address the issue by proposing a causal mapping process based on coordinate representations for heterogeneous non-Euclidean data. We propose a data generation mechanism between the parent nodes and the child nodes and create a causal mechanism based on multi-dimensional tensor regression. Furthermore, within the aforementioned theoretical framework, we propose a two-stage causal discovery approach based on regularized generalized canonical correlation analysis. Using the discrete representation in the shared projection direction, causal relationships between heterogeneous non-Euclidean variables can be discovered more accurately. Finally, empirical research is conducted on real-world industrial sensor data, which demonstrates the effectiveness of the proposed method for discovering causal relationships in heterogeneous non-Euclidean data. © 2024 Systems Engineering Society of China. All rights reserved.
Pages: 1987-2002
Number of pages: 15