An algorithm on mining approximate functional dependencies in probabilistic database

Authors
Miao, Dongjing [1]
Liu, Xianmin [1]
Li, Jianzhong [1]
Affiliation
[1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
DOI
10.7544/issn1000-1239.2015.20140685
Abstract
An approximate functional dependency (AFD) is a functional dependency that almost holds. Most existing work can mine AFDs only from conventional (non-probabilistic) data. When data is instead stored in a probabilistic database, mining AFDs requires a new notion, so we define the probabilistic AFD, namely the (λ, δ)-AFD, which differs from previous definitions. We propose a dynamic programming algorithm that computes the confidence probability of a candidate AFD and checks whether it exceeds the probability threshold. Because the dynamic programming has high time complexity, we derive a lower bound based on the Chernoff bound to prune as many candidates as possible. Then, with the help of the anti-monotone property, we propose a mining algorithm based on lexicographical order, together with several pruning criteria that speed up the mining process. Finally, experiments on synthetic and real-life data sets show the effectiveness of the pruning criteria and the scalability of the mining algorithm, and we present interesting results mined from the DBLP data set. © 2015, Science Press. All rights reserved.
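
The abstract does not give the formal model, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes each tuple violates a candidate dependency independently with a known probability, that λ bounds the tolerated fraction of violating tuples, and that δ is the probability threshold. All function names are hypothetical. Under those assumptions the confidence probability is a Poisson-binomial tail, which a simple dynamic program computes exactly, and a standard multiplicative Chernoff bound gives a cheap lower bound that can settle a candidate before the dynamic program runs.

```python
import math
from typing import Sequence


def confidence_dp(violation_probs: Sequence[float], max_violations: int) -> float:
    """Exact P(#violating tuples <= max_violations), assuming independent tuples."""
    if max_violations < 0:
        return 0.0
    # dp[k] = probability that exactly k of the tuples processed so far violate.
    # Counts above max_violations are never needed, so the table is truncated.
    dp = [1.0] + [0.0] * max_violations
    for p in violation_probs:
        for k in range(max_violations, 0, -1):  # descending: dp[k - 1] is still old
            dp[k] = dp[k] * (1.0 - p) + dp[k - 1] * p
        dp[0] *= 1.0 - p
    return sum(dp)


def chernoff_lower_bound(violation_probs: Sequence[float], max_violations: int) -> float:
    """Lower bound on the same probability via P(X >= (1+d)mu) <= exp(-d^2 mu / (2+d))."""
    if max_violations < 0:
        return 0.0
    mu = sum(violation_probs)      # expected number of violating tuples
    t = max_violations + 1         # the candidate fails when X >= t
    if mu == 0.0:
        return 1.0                 # no tuple can ever violate
    if t <= mu:
        return 0.0                 # bound is uninformative in this regime
    d = t / mu - 1.0
    return max(0.0, 1.0 - math.exp(-d * d * mu / (2.0 + d)))


def is_lambda_delta_afd(violation_probs: Sequence[float], lam: float, delta: float) -> bool:
    """Assumed test: confidence probability must exceed delta."""
    max_violations = math.floor(lam * len(violation_probs))
    # Cheap check first: if the lower bound already exceeds delta, skip the DP.
    if chernoff_lower_bound(violation_probs, max_violations) > delta:
        return True
    return confidence_dp(violation_probs, max_violations) > delta
```

The dynamic program costs time proportional to the number of tuples times the violation budget, while the bound needs a single pass. That cost gap is the reason a bound-first check is attractive.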
Pages: 2857-2865