Indexing Correlated Probabilistic Databases

Cited by: 0
Authors
Kanagal, Bhargav [1]
Deshpande, Amol [1]
Affiliations
[1] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
Keywords
Probabilistic Databases; Indexing; Junction Trees; Caching; Inference queries;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
With large amounts of correlated probabilistic data being generated in a wide range of application domains, including sensor networks, information extraction, and event detection, effectively managing and querying such data has become an important research direction. While there is an extensive body of literature on querying independent probabilistic data, supporting efficient queries over large-scale, correlated databases remains a challenge. In this paper, we develop efficient data structures and indexes for supporting inference and decision support queries over such databases. Our proposed hierarchical data structure is suitable for both in-memory and disk-resident databases. We represent the correlations in the probabilistic database using a junction tree over the tuple-existence or attribute-value random variables, and use tree partitioning techniques to build an index structure over it. We show how to efficiently answer inference and aggregation queries using such an index, resulting in orders-of-magnitude performance benefits in most cases. In addition, we develop novel algorithms for efficiently keeping the index structure up to date as changes (inserts, updates) are made to the probabilistic database. We present a comprehensive experimental study illustrating the benefits of our approach to query processing in probabilistic databases.
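The general idea of partitioning a correlation structure and precomputing a summary per partition can be illustrated on a deliberately simplified case. The sketch below is not the paper's data structure: it assumes the junction tree degenerates to a chain of variables X_0, ..., X_n, where each clique (X_i, X_{i+1}) is given as a conditional table P(X_{i+1} | X_i). The function names (`build_shortcuts`, `conditional`, `naive_conditional`) and the fixed-size blocking scheme are hypothetical, chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def identity(n: int) -> Matrix:
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product; composes two conditional tables."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def build_shortcuts(transitions: List[Matrix], block_size: int) -> List[Matrix]:
    """Partition the chain into blocks and precompute one summary per block.

    transitions[i] is assumed to hold P(X_{i+1} | X_i), rows indexed by X_i.
    The summary for a block is the product of its tables, i.e. the
    conditional between the block's two boundary variables.
    """
    shortcuts = []
    for start in range(0, len(transitions), block_size):
        prod = transitions[start]
        for t in transitions[start + 1:start + block_size]:
            prod = matmul(prod, t)
        shortcuts.append(prod)
    return shortcuts


def conditional(transitions: List[Matrix], shortcuts: List[Matrix],
                block_size: int, a: int, b: int) -> Matrix:
    """Return P(X_b | X_a) for a <= b, skipping fully covered blocks."""
    result = identity(len(transitions[0]))
    i = a
    while i < b:
        if i % block_size == 0 and i + block_size <= b:
            # The whole block lies inside [a, b]: use its precomputed summary.
            result = matmul(result, shortcuts[i // block_size])
            i += block_size
        else:
            # Partially covered block: fall back to the individual tables.
            result = matmul(result, transitions[i])
            i += 1
    return result


def naive_conditional(transitions: List[Matrix], a: int, b: int) -> Matrix:
    """Baseline without any index: multiply every table between a and b."""
    result = identity(len(transitions[0]))
    for t in transitions[a:b]:
        result = matmul(result, t)
    return result
```

With the summaries in place, a query whose endpoints are far apart touches roughly one table per covered block instead of one per variable. A hierarchical index would apply the same idea recursively; that extension is left out of this sketch.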
Pages: 455-468
Number of pages: 14