The Nu Expression for Probabilistic Data Integration

Authors
Evgenia I. Polyakova
Andre G. Journel
Affiliation
[1] Stanford University, Department of Geological and Environmental Sciences
Source
Mathematical Geology | 2007 / Vol. 39
Keywords
Data integration; Data interaction vs. dependence; Updating probabilities; Conditional independence
Abstract
The general problem of data integration is expressed as that of combining probability distributions, each conditioned to an individual datum or data event, into a posterior probability for the unknown conditioned jointly to all data. Any such combination of information requires taking into account data interaction for the specific event being assessed. The nu expression provides an exact analytical representation of such a combination. This representation allows a clear and useful separation of the two components of any data integration algorithm: individual data information content and data interaction, the latter being different from data dependence. Any estimation workflow that fails to address data interaction is not only suboptimal, but may result in severe bias. The nu expression reduces the possibly very complex joint data interaction to a single multiplicative correction parameter ν0. This parameter is difficult to evaluate, but its exact analytical expression is given, and the availability of that expression provides avenues for its determination or approximation. The case ν0=1 is more comprehensive than data conditional independence; it delivers a preliminary robust approximation in the presence of actual data interaction. An experiment where the exact results are known allows the results of the ν0=1 approximation to be checked against the traditional estimators based on the assumption of data independence.
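The abstract does not state the formula itself. The following is a minimal sketch, assuming the commonly cited form of the nu expression written with the ratios x = (1 - P) / P: x / x0 = ν0 · Π (x_i / x0), where x0 comes from the prior probability, each x_i from the probability conditioned to a single datum, and x from the jointly conditioned posterior. The function names `odds_against` and `nu_combine` are hypothetical and chosen for illustration only; ν0 is taken as a given input here, since the abstract notes that it is difficult to evaluate.

```python
from math import prod
from typing import Sequence


def odds_against(p: float) -> float:
    """Return the ratio (1 - p) / p for a probability strictly inside (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probabilities must lie strictly between 0 and 1")
    return (1.0 - p) / p


def nu_combine(prior: float, conditionals: Sequence[float], nu0: float = 1.0) -> float:
    """Combine single-datum probabilities into one posterior probability.

    Assumed form (not quoted from the abstract): x / x0 = nu0 * prod(x_i / x0).
    nu0 = 1 is the approximation discussed in the abstract.
    """
    if nu0 <= 0.0:
        raise ValueError("nu0 must be positive")
    x0 = odds_against(prior)
    # Each factor measures how one datum moves the ratio away from the prior.
    factors = [odds_against(p) / x0 for p in conditionals]
    x = x0 * nu0 * prod(factors)
    # Convert the combined ratio back to a probability.
    return 1.0 / (1.0 + x)


if __name__ == "__main__":
    print(nu_combine(0.5, [0.8, 0.8]))           # nu0 = 1 approximation
    print(nu_combine(0.5, [0.8, 0.8], nu0=2.0))  # interaction lowers the posterior
```

Under these assumptions, a prior of 0.5 combined with two data that each give 0.8 yields 16/17 (about 0.941) for ν0=1, and a single datum returns its own conditional probability unchanged. Values of ν0 other than 1 act as the multiplicative correction for data interaction.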
Pages: 715-733
Page count: 18
Related papers
  • [41] Probabilistic Look-ahead Contingency Analysis Integration with Commercial Tool and Practical Data
    Chen, Yousu
    Ren, Huiying
    Wu, Jun
    Hou, Zhangshuan
    IFAC PAPERSONLINE, 2020, 53 (02): : 13125 - 13130
  • [42] Probabilistic Count Matrix Factorization for Single Cell Expression Data Analysis
    Durif, G.
    Modolo, L.
    Mold, J. E.
    Lambert-Lacroix, S.
    Picard, F.
    RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, RECOMB 2018, 2018, 10812 : 254 - 255
  • [43] Probabilistic count matrix factorization for single cell expression data analysis
    Durif, Ghislain
    Modolo, Laurent
    Mold, Jeff E.
    Lambert-Lacroix, Sophie
    Picard, Franck
    BIOINFORMATICS, 2019, 35 (20) : 4011 - 4019
  • [44] Probabilistic lung cancer models conditioned on gene expression microarray data
    Friedman, C
    Cao, WB
    Fan, C
    METHODS OF MICROARRAY DATA ANALYSIS IV, 2005, : 133 - 146
  • [45] Analysis of called arrayCGH data: Clustering, testing and integration with gene expression data
    van de Wiel, Mark
    Ylstra, Bauke
    van Wieringen, Wessel
    CELLULAR ONCOLOGY, 2008, 30 (02) : 92 - 93
  • [46] BioGPS and GXD: mouse gene expression data—the benefits and challenges of data integration
    Martin Ringwald
    Chunlei Wu
    Andrew I. Su
    Mammalian Genome, 2012, 23 : 550 - 558
  • [47] NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
    Lee, Junhyeok
    Han, Seungu
    INTERSPEECH 2021, 2021, : 1634 - 1638
  • [48] Converting probabilistic relational data to probabilistic XML data tree
    Wang J.
    Hao Z.
    Information Technology Journal, 2010, 9 (08) : 1706 - 1712
  • [49] Integration of biological networks and gene expression data using Cytoscape
    Cline, Melissa S.
    Smoot, Michael
    Cerami, Ethan
    Kuchinsky, Allan
    Landys, Nerius
    Workman, Chris
    Christmas, Rowan
    Avila-Campilo, Iliana
    Creech, Michael
    Gross, Benjamin
    Hanspers, Kristina
    Isserlin, Ruth
    Kelley, Ryan
    Killcoyne, Sarah
    Lotia, Samad
    Maere, Steven
    Morris, John
    Ono, Keiichiro
    Pavlovic, Vuk
    Pico, Alexander R.
    Vailaya, Aditya
    Wang, Peng-Liang
    Adler, Annette
    Conklin, Bruce R.
    Hood, Leroy
    Kuiper, Martin
    Sander, Chris
    Schmulevich, Ilya
    Schwikowski, Benno
    Warner, Guy J.
    Ideker, Trey
    Bader, Gary D.
    NATURE PROTOCOLS, 2007, 2 (10) : 2366 - 2382
  • [50] Comparative analysis of algorithms for integration of copy number and expression data
    Louhimo, Riku
    Lepikhova, Tatiana
    Monni, Outi
    Hautaniemi, Sampsa
    NATURE METHODS, 2012, 9 (04) : 351 - U50