Measuring Mutual Information Between All Pairs of Variables in Subquadratic Complexity

Cited by: 0
Authors
Ferdosi, Mohsen [1 ]
Davoodi, Arash Gholami [1 ]
Mohimani, Hosein [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Funding
National Institutes of Health (USA);
Keywords
BAYESIAN NETWORKS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Finding associations between pairs of variables in large datasets is crucial for various disciplines. The brute force method for solving this problem requires computing the mutual information between $\binom{N}{2}$ pairs. In this paper, we consider the problem of finding pairs of variables with high mutual information in sub-quadratic complexity. This problem is analogous to the nearest neighbor search, where the goal is to find pairs among $N$ variables that are similar to each other. To solve this problem, we develop a new algorithm for finding associations based on constructing a decision tree that assigns a hash to each variable, in a way that for pairs with higher mutual information, the chance of having the same hash is higher. For any $1 \le \lambda \le 2$, we prove that in the case of binary data, we can reduce the number of necessary mutual information computations for finding all pairs satisfying $I(X, Y) > 2 - \lambda$ from $O(N^2)$ to $O(N^\lambda)$, where $I(X, Y)$ is the empirical mutual information between variables $X$ and $Y$. Finally, we confirmed our theory by experiments on simulated and real data.
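For orientation, the brute-force baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it computes the empirical mutual information for every one of the $\binom{N}{2}$ pairs and keeps those above the threshold $2 - \lambda$. The function names `empirical_mi` and `brute_force_pairs`, the choice of bits (log base 2) as the unit, and the list-of-lists data layout are assumptions made for this sketch.

```python
from itertools import combinations
from math import log2


def empirical_mi(x, y):
    """Empirical mutual information, in bits, of two equal-length sequences."""
    n = len(x)
    # Count joint occurrences of each (x_value, y_value) pair.
    joint = {}
    for a, b in zip(x, y):
        joint[(a, b)] = joint.get((a, b), 0) + 1
    # Derive the marginal counts from the joint counts.
    px, py = {}, {}
    for (a, b), c in joint.items():
        px[a] = px.get(a, 0) + c
        py[b] = py.get(b, 0) + c
    # Sum p(a, b) * log2( p(a, b) / (p(a) * p(b)) ) over observed cells.
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def brute_force_pairs(variables, lam):
    """Return (i, j, mi) for all pairs with empirical MI above 2 - lam.

    This performs one MI computation per pair, i.e. a quadratic number
    of computations in the number of variables.
    """
    threshold = 2 - lam
    hits = []
    for i, j in combinations(range(len(variables)), 2):
        mi = empirical_mi(variables[i], variables[j])
        if mi > threshold:
            hits.append((i, j, mi))
    return hits


if __name__ == "__main__":
    # Toy binary data (hypothetical): variables 0 and 1 are identical,
    # variable 2 is empirically independent of both.
    data = [
        [0, 1, 0, 1],
        [0, 1, 0, 1],
        [0, 0, 1, 1],
    ]
    print(brute_force_pairs(data, lam=1.5))
```

The hashing scheme described in the abstract aims to avoid most of these pairwise computations by comparing only variables that receive the same hash; that part is not reproduced here.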
Pages: 4399 - 4408
Page count: 10
Related papers
50 in total
  • [31] Research on the correlation between the mutual information and Lempel-Ziv complexity of nonlinear time series
    Zhang Dian-Zhong
    ACTA PHYSICA SINICA, 2007, 56 (06) : 3152 - 3157
  • [32] Mutual information discloses relationship between hemodynamic variables in artificial heart-implanted dogs
    Osaka, M
    Yambe, T
    Saitoh, H
    Yoshizawa, M
    Itoh, T
    Nitta, S
    Kishida, H
    Hayakawa, H
    AMERICAN JOURNAL OF PHYSIOLOGY-HEART AND CIRCULATORY PHYSIOLOGY, 1998, 275 (04) : H1419 - H1433
  • [33] Flow complexity in open systems: interlacing complexity index based on mutual information
    Pozo, Jose M.
    Geers, Arjan J.
    Villa-Uriol, Maria-Cruz
    Frangi, Alejandro F.
    JOURNAL OF FLUID MECHANICS, 2017, 825 : 704 - 742
  • [34] Closing the complexity gap between FCFS mutual exclusion and mutual exclusion
    Robert Danek
    Wojciech Golab
    Distributed Computing, 2010, 23 : 87 - 111
  • [35] Closing the Complexity Gap between FCFS Mutual Exclusion and Mutual Exclusion
    Danek, Robert
    Golab, Wojciech
    DISTRIBUTED COMPUTING, PROCEEDINGS, 2008, 5218 : 93 - 108
  • [36] Closing the Complexity Gap Between Mutual Exclusion and FCFS Mutual Exclusion
    Danek, Robert
    Golab, Wojciech
    PODC'08: PROCEEDINGS OF THE 27TH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, 2008, : 448 - 448
  • [37] Closing the complexity gap between FCFS mutual exclusion and mutual exclusion
    Danek, Robert
    Golab, Wojciech
    DISTRIBUTED COMPUTING, 2010, 23 (02) : 87 - 111
  • [38] Measuring Enterprise Mutual Information Based on the Helix Model
    Wang, Wei
    Huang, Xucheng
    Luo, Shougui
    JOURNAL OF ORGANIZATIONAL AND END USER COMPUTING, 2022, 34 (07) : 1 - 17
  • [39] A Mutual Information-based method to select informative pairs of variables in case-control genetic association studies to improve the power of detecting interaction between genetic variants
    Emily, Mathieu
    Friguet, Chloe
    JOURNAL OF THE SFDS, 2018, 159 (02) : 84 - 110
  • [40] A new estimate of mutual information based measure of dependence between two variables: properties and fast implementation
    Namita Jain
    C. A. Murthy
    International Journal of Machine Learning and Cybernetics, 2016, 7 : 857 - 875