Mining significant association rules from uncertain data

Cited by: 0
Authors
Anshu Zhang
Wenzhong Shi
Geoffrey I. Webb
Institutions
[1] The Hong Kong Polytechnic University, Department of Land Surveying and Geo
[2] Monash University, Informatics
Source
Data Mining and Knowledge Discovery, 2016, 30 (04)
Keywords
Pattern discovery; Association rules; Statistical evaluation; Uncertain data
DOI
Not available
Abstract
In association rule mining, the trade-off between avoiding harmful spurious rules and preserving authentic ones is an ever-critical barrier to obtaining reliable and useful results. The statistically sound technique for evaluating the statistical significance of association rules is superior in preventing spurious rules, yet it can also cause severe loss of true rules in the presence of data error. This study presents a new and improved method for statistical testing of association rules on uncertain, erroneous data. An original mathematical model was established to describe how data error propagates through the computational procedures of the statistical test. Based on this error model, a scheme combining analytic and simulative processes was designed to correct the statistical test for distortions caused by data error. Experiments on both synthetic and real-world data show that the method significantly recovers the true rules that the original statistically sound method loses to data error (that is, it reduces type-2 error). Meanwhile, the new method maintains effective control over the familywise error rate, which is the distinctive advantage of the original statistically sound technique. Furthermore, the method is robust to inaccurate information about data error probabilities and to situations that violate the commonly accepted assumption that the error probabilities of different data items are independent. The method is particularly effective for rules that are most practically meaningful yet sensitive to data error. The method shows promise for enhancing the value of association rule mining results and helping users make correct decisions.
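To make the setting in the abstract concrete, the following is a minimal sketch of a significance test on a single rule X -> Y, together with a simple way to inject item-level data error. It is not the authors' correction method. The abstract does not name the test or the multiple-testing adjustment, so the one-sided Fisher exact test and the Bonferroni threshold used here are assumptions, and all function names (`fisher_one_sided`, `contingency`, `flip_items`, `significant_rules`) are hypothetical.

```python
import math
import random


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for positive association of X and Y.

    a = #(X and Y), b = #(X, not Y), c = #(not X, Y), d = #(neither).
    Assumption: the abstract does not say which test the paper uses.
    """
    n = a + b + c + d
    row_x = a + b  # transactions containing X
    col_y = a + c  # transactions containing Y
    upper = min(row_x, col_y)
    # Hypergeometric upper tail: probability of a co-occurrence count >= a.
    tail = sum(
        math.comb(row_x, i) * math.comb(n - row_x, col_y - i)
        for i in range(a, upper + 1)
    )
    return tail / math.comb(n, col_y)


def contingency(transactions, x, y):
    """Count the 2x2 table (a, b, c, d) for items x and y."""
    a = b = c = d = 0
    for t in transactions:
        has_x, has_y = x in t, y in t
        if has_x and has_y:
            a += 1
        elif has_x:
            b += 1
        elif has_y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def flip_items(transactions, items, error_prob, rng):
    """Flip each item's presence independently with probability error_prob."""
    noisy = []
    for t in transactions:
        noisy_t = set()
        for item in items:
            present = item in t
            if rng.random() < error_prob:
                present = not present
            if present:
                noisy_t.add(item)
        noisy.append(noisy_t)
    return noisy


def significant_rules(p_values, alpha=0.05):
    """Keep rules whose p-value passes a Bonferroni threshold (assumed here)."""
    threshold = alpha / len(p_values)
    return [rule for rule, p in p_values.items() if p <= threshold]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: x and y co-occur far more often than chance predicts.
    clean = [{"x", "y"}] * 40 + [{"x"}] * 10 + [{"y"}] * 10 + [set()] * 40
    noisy = flip_items(clean, ["x", "y"], 0.2, rng)
    print("clean p-value:", fisher_one_sided(*contingency(clean, "x", "y")))
    print("noisy p-value:", fisher_one_sided(*contingency(noisy, "x", "y")))
```

Comparing the two printed p-values gives a rough feel for how item-level error can change the evidence for a rule under a familywise threshold. The correction scheme described in the abstract, which combines analytic and simulative steps, is not reproduced here.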
Pages: 928 - 963
Page count: 35
Related papers
50 in total
  • [1] Mining significant association rules from uncertain data
    Zhang, Anshu
    Shi, Wenzhong
    Webb, Geoffrey I.
    DATA MINING AND KNOWLEDGE DISCOVERY, 2016, 30 (04) : 928 - 963
  • [2] Mining fuzzy association rules from uncertain data
    Cheng-Hsiung Weng
    Yen-Liang Chen
    Knowledge and Information Systems, 2010, 23 : 129 - 152
  • [3] Mining fuzzy association rules from uncertain data
    Weng, Cheng-Hsiung
    Chen, Yen-Liang
    KNOWLEDGE AND INFORMATION SYSTEMS, 2010, 23 (02) : 129 - 152
  • [4] Mining Probabilistic Association Rules from Uncertain Databases with Pruning
    Peterson, Erich A.
    Zhang, Liang
    Tang, Peiyi
    IEEE SOUTHEASTCON 2014, 2014
  • [5] Mining association rules on significant rare data using relative support
    Yun, HY
    Ha, DS
    Hwang, BY
    Ryu, KH
    JOURNAL OF SYSTEMS AND SOFTWARE, 2003, 67 (03) : 181 - 191
  • [6] Mining association rules from quantitative data
    Hong, Tzung-Pei
    Kuo, Chan-Sheng
    Chi, Sheng-Chai
    Intelligent Data Analysis, 1999, 3 (05) : 363 - 376
  • [7] Mining significant association rules from educational data using critical relative support approach
    Abdullah, Zailani
    Herawan, Tutut
    Ahmad, Noraziah
    Deris, Mustafa Mat
    WORLD CONFERENCE ON EDUCATIONAL TECHNOLOGY RESEARCHES-2011, 2011, 28
  • [8] Mining association rules with uncertain item relationships
    Shyu, ML
    Haruechaiyasak, C
    Chen, SC
    Premaratne, K
    6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XVI, PROCEEDINGS: COMPUTER SCIENCE III, 2002, : 435 - 440
  • [9] Mining significant association rules (short version)
    Li, JY
    Shen, H
    Pritchard, P
    INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-V, PROCEEDINGS, 1999, : 1458 - 1461
  • [10] Mining association rules from structured xml data
    Faculty of Computer Science, Information Technology, University Putra Malaysia, Serdang, Selangor, 43400, Malaysia
    Proc. Int. Conf. Electr. Eng. Informatics, ICEEI, 1600, (376-379):