Mining combined causes in large data sets

Cited by: 13

Authors:

Ma, Saisai [1]

Li, Jiuyong [1]

Liu, Lin [1]

Thuc Duy Le [1]

Affiliations:

[1] Univ S Australia, Sch Informat Technol & Math Sci, Mawson Lakes, SA 5095, Australia

Source:

KNOWLEDGE-BASED SYSTEMS | 2016, Vol. 92

Funding:

Australian Research Council;

Keywords:

Causal discovery; Combined causes; Local causal discovery; HITON-PC; Multi-level HITON-PC; LEARNING BAYESIAN NETWORKS; ASSOCIATION; DISCOVERY; CAUSATION; MODELS;

DOI:

10.1016/j.knosys.2015.10.018

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, many methods have been developed for detecting causal relationships in observational data. Some of them have the potential to tackle large data sets. However, these methods fail to discover a combined cause, i.e. a multi-factor cause consisting of two or more component variables which individually are not causes. A straightforward approach to uncovering a combined cause is to include both individual and combined variables in the causal discovery using existing methods, but this scheme is computationally infeasible due to the huge number of combined variables. In this paper, we propose a novel approach to address this practical causal discovery problem, i.e. mining combined causes in large data sets. The experiments with both synthetic and real world data sets show that the proposed method can obtain high-quality causal discoveries with a high computational efficiency. (C) 2015 Elsevier B.V. All rights reserved.
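As an illustration of the two ideas in the abstract (a combination of variables that matters when its components individually do not, and the rapid growth in the number of candidate combinations), here is a minimal Python sketch. It is not the method proposed in the paper. The toy data, the XOR relationship, and the helper names `conditional_rate` and `count_combined` are all assumptions made for this example, and the sketch compares conditional frequencies only, which shows association rather than causation.

```python
from itertools import product
from math import comb


def conditional_rate(rows, cond):
    """Fraction of rows with z == 1 among the rows that satisfy cond."""
    selected = [r for r in rows if cond(r)]
    return sum(r["z"] for r in selected) / len(selected)


# Hypothetical toy data: each (a, b) combination occurs once and z = a XOR b.
rows = [{"a": a, "b": b, "z": a ^ b} for a, b in product([0, 1], repeat=2)]

# Individually, a carries no information about z: both rates are equal.
rate_a1 = conditional_rate(rows, lambda r: r["a"] == 1)
rate_a0 = conditional_rate(rows, lambda r: r["a"] == 0)

# The combined variable (a=1, b=0) changes the rate of z sharply.
rate_combo = conditional_rate(rows, lambda r: (r["a"], r["b"]) == (1, 0))
rate_rest = conditional_rate(rows, lambda r: (r["a"], r["b"]) != (1, 0))


def count_combined(n_vars, max_size):
    """Number of variable subsets of size 2..max_size among n_vars variables.

    This is a lower bound on the number of combined variables, because each
    subset can take several value combinations.
    """
    return sum(comb(n_vars, k) for k in range(2, max_size + 1))


if __name__ == "__main__":
    print("P(z=1 | a=1) =", rate_a1, " P(z=1 | a=0) =", rate_a0)
    print("P(z=1 | a=1,b=0) =", rate_combo, " otherwise =", rate_rest)
    print("subsets of size 2..3 among 100 variables:", count_combined(100, 3))
```

With 100 variables, pairs and triples alone already give 166,650 candidate subsets, which is the kind of growth that makes adding every combined variable to an existing method impractical.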


Pages: 104-111

Page count: 8

