Visual Bayesian Fusion to Navigate a Data Lake

Authors
Singh, Karamjit [1]
Paneri, Kaushal [1]
Pandey, Aditeya [1]
Gupta, Garima [1]
Sharma, Geetika [1]
Agarwal, Puneet [1]
Shroff, Gautam [1]
Affiliation
[1] Tata Consultancy Serv Ltd, TCS Res, Gurgaon, India
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The evolution from traditional business intelligence to big data analytics has witnessed the emergence of 'Data Lakes', in which data is ingested in raw form rather than into traditional data warehouses. With the increasing availability of many more pieces of information about each entity of interest, e.g., a customer, often from diverse sources (social media, mobility, internet-of-things), fusing, visualizing and deriving insights from such data pose a number of challenges. First, disparate datasets often lack a natural join key. Next, datasets may describe measures at different levels of granularity, e.g., individual vs. aggregate data. Finally, different datasets may be derived from physically distinct populations. Moreover, once data has been fused, queries are often an inefficient and inaccurate mechanism for deriving insight from high-dimensional data. In this paper we describe iFuse, a data-fusion based visual analytics platform for navigating a data lake to derive insights. We rely on Bayesian graphical models to provide a useful rudder with which to fuse and analyze disparate islands of data in a systematic manner. Our platform allows for rich interactive visualizations, querying and keyword-based search within and across datasets or models, as well as intuitive visual interfaces for value imputation or model-based predictions. We illustrate the use of our platform in multiple scenarios, including two public data challenges as well as a real-life industry use case involving the probabilistic fusion of datasets that lack a natural join key.
Pages: 987-994
Page count: 8