An algorithm on mining approximate functional dependencies in probabilistic database

Authors
Miao, Dongjing [1]
Liu, Xianmin [1]
Li, Jianzhong [1]
Affiliation
[1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
DOI
10.7544/issn1000-1239.2015.20140685
Abstract
An approximate functional dependency (AFD) is a functional dependency that almost holds. Most existing work can mine AFDs only from general (non-probabilistic) data, yet data is sometimes stored in a probabilistic database. To mine AFDs from such data, we define a probabilistic AFD, the (λ, δ)-AFD, which differs from previous definitions. We propose a dynamic programming algorithm that computes the confidence probability of a candidate AFD and checks whether it exceeds the probability threshold. Because the dynamic programming has high time complexity, we derive a lower bound based on the Chernoff bound to prune as many candidates as possible. Then, with the help of the anti-monotone property, we propose a mining algorithm based on lexicographical order, together with several pruning criteria, to speed up the mining process. Finally, experiments on synthetic and real-life data sets show the effectiveness of the pruning criteria and the scalability of the mining algorithm, and we present interesting results mined from the DBLP data set. © 2015, Science Press. All rights reserved.
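As a rough illustration of what "almost holds" can mean, the sketch below computes the g3 error commonly used for AFDs on ordinary (deterministic) data: the smallest fraction of rows that would have to be removed for the dependency to hold exactly. This is not the paper's (λ, δ)-AFD, whose definition the abstract does not give. The function name `g3_error`, the sample rows, and the threshold 0.25 are assumptions made only for this example.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Smallest fraction of rows to remove so that lhs -> rhs holds exactly.

    rows: list of dicts, lhs: list of attribute names, rhs: one attribute name.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    if not rows:
        return 0.0
    # Group rows by their left-hand-side values and count right-hand-side values.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key][row[rhs]] += 1
    # In each group, keep the rows carrying the most frequent right-hand-side value.
    kept = sum(max(counter.values()) for counter in groups.values())
    return 1.0 - kept / len(rows)


# Made-up sample data: one row out of five breaks zip -> city.
rows = [
    {"zip": "150001", "city": "Harbin"},
    {"zip": "150001", "city": "Harbin"},
    {"zip": "150001", "city": "Harbin"},
    {"zip": "150001", "city": "Haerbin"},
    {"zip": "100080", "city": "Beijing"},
]

error = g3_error(rows, ["zip"], "city")
# Assumed error threshold of 0.25, chosen only for this example.
holds_approximately = error <= 0.25
```

In a probabilistic database the rows themselves are uncertain, so an error measure like this becomes a random quantity; the abstract's confidence probability and probability threshold address that setting.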
Pages: 2857-2865