Knowledge discovery interestingness measures based on unexpectedness

Cited by: 19
Authors
Kontonasios, Kleanthis-Nikolaos [1 ]
Spyropoulou, Eirini [1 ]
De Bie, Tijl [1 ]
Affiliations
[1] Univ Bristol, Intelligent Syst Lab, Bristol, Avon, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
PATTERNS;
DOI
10.1002/widm.1063
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge discovery methods often discover a large number of patterns. Although this can be considered of interest, it certainly presents considerable challenges too. Indeed, this set of patterns often contains lots of uninteresting patterns that risk overwhelming the data miner. In addition, a single interesting pattern can be discovered in a multitude of tiny variations that for all practical purposes are redundant. These issues are referred to as the pattern explosion problem. They lie at the basis of much recent research attempting to quantify interestingness and redundancy between patterns, with the purpose of filtering down a large pattern set to an interesting and compact subset. Many diverse approaches to interestingness and corresponding interestingness measures (IMs) have been proposed in the literature. Some of them, named objective IMs, define interestingness only based on objective criteria of the pattern and data at hand. Subjective IMs additionally depend on the user's prior knowledge about the dataset. Formalizing unexpectedness is probably the most common approach for defining subjective IMs, where a pattern is deemed unexpected if it contradicts the user's expectations about the dataset. Such subjective IMs based on unexpectedness form the focus of this paper. We categorize measures based on unexpectedness into two major subgroups, namely, syntactical and probabilistic approaches. Based on this distinction, we survey different methods for assessing the unexpectedness of patterns with a special focus on frequent itemsets, tiles, association rules, and classification rules. (c) 2012 Wiley Periodicals, Inc.
Pages: 386-399
Number of pages: 14