Frequent itemset discovery with SQL using universal quantification

Cited by: 0
Authors
Rantzau, R [1]
Affiliations
[1] Univ Stuttgart, Dept Comp Sci Elect Engn & Informat Technol, D-70569 Stuttgart, Germany
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Algorithms for finding frequent itemsets fall into two broad categories: algorithms that use non-trivial SQL statements to query and update a database, and algorithms that employ sophisticated in-memory data structures, with the data stored in flat files. Most performance experiments have shown that SQL-based approaches are inferior to main-memory algorithms. However, the current trend among database vendors to integrate analysis functionality into their query execution and optimization components, i.e., "closer to the data," suggests revisiting these results and searching for new, potentially better solutions. We investigate approaches based on SQL-92 and present a new approach called Quiver that employs universal and existential quantification. In the table schema our approach uses for itemsets, a group of tuples represents a single itemset. Such a "vertical" layout is similar to the popular layout used for the transaction table, which is the input of frequent itemset discovery. We show that current DBMSs do not provide efficient query processing strategies for quantified queries, mostly because SQL lacks an adequate syntax for set containment tests. Performance tests using a query processor prototype and a novel query operator, called set containment division, promise improved performance for quantified queries like those used for Quiver.
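The set containment test mentioned in the abstract can be illustrated with a small sketch. This is not the Quiver query set from the paper; it only shows the general idea of expressing "every item of the itemset occurs in the transaction" in SQL-92 style, where universal quantification has to be written as a doubly negated EXISTS. The table names `transactions` and `candidates`, their columns, and the helper `itemset_supports` are assumptions made for this illustration, and SQLite stands in for a generic DBMS. Both tables use the vertical layout, one row per (identifier, item) pair.

```python
import sqlite3


def itemset_supports(transactions, candidates):
    """Count, for each candidate itemset, the transactions that contain it.

    Both arguments are dicts mapping an integer id to a list of items.
    Table and column names are hypothetical, chosen for this sketch only.
    """
    con = sqlite3.connect(":memory:")
    # Vertical layout: a group of rows with the same id forms one set.
    con.execute("CREATE TABLE transactions (tid INTEGER, item TEXT)")
    con.execute("CREATE TABLE candidates (itemset INTEGER, item TEXT)")
    con.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [(tid, item) for tid, items in transactions.items() for item in items],
    )
    con.executemany(
        "INSERT INTO candidates VALUES (?, ?)",
        [(sid, item) for sid, items in candidates.items() for item in items],
    )
    # "For all items of the itemset there exists a matching transaction row"
    # is written as "there is no item of the itemset without a matching row".
    query = """
        SELECT c.itemset, COUNT(*) AS support
        FROM (SELECT DISTINCT itemset FROM candidates) AS c,
             (SELECT DISTINCT tid FROM transactions) AS t
        WHERE NOT EXISTS (
            SELECT * FROM candidates AS c2
            WHERE c2.itemset = c.itemset
              AND NOT EXISTS (
                  SELECT * FROM transactions AS t2
                  WHERE t2.tid = t.tid AND t2.item = c2.item))
        GROUP BY c.itemset
        ORDER BY c.itemset
    """
    supports = dict(con.execute(query).fetchall())
    con.close()
    # Itemsets contained in no transaction do not appear in the result.
    return supports
```

With the transactions {1: a b c, 2: a c, 3: b c, 4: a b} and the candidates {1: a b, 2: a c, 3: b d}, the function returns supports of 2 for itemsets 1 and 2, and itemset 3 is absent because no transaction contains item d. The nested correlated subqueries are the kind of formulation that, according to the abstract, current DBMSs do not process efficiently.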
Pages: 194 - 213
Number of pages: 20
Related papers
  • [1] Pattern Discovery in Conceptual Models Using Frequent Itemset Mining
    Fumagalli, Mattia
    Sales, Tiago Prince
    Guizzardi, Giancarlo
    CONCEPTUAL MODELING (ER 2022), 2022, 13607 : 52 - 62
  • [2] Approximate Frequent Itemset Discovery from Data Stream
    Ciampi, Anna
    Fumarola, Fabio
    Appice, Annalisa
    Malerba, Donato
    AI*IA 2009: EMERGENT PERSPECTIVES IN ARTIFICIAL INTELLIGENCE, 2009, 5883 : 151 - 160
  • [3] SEMANTICS AND PROBLEMS OF UNIVERSAL QUANTIFICATION IN SQL
    NEGRI, M
    PELAGATTI, G
    SBATTELLA, L
    COMPUTER JOURNAL, 1989, 32 (01): : 90 - 91
  • [4] A Support-Ordered Trie for fast frequent itemset discovery
    Woon, YK
    Ng, WK
    Lim, EP
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2004, 16 (07) : 875 - 879
  • [5] Temporal aggregates and temporal universal quantification in standard SQL
    Zimanyi, Esteban
    SIGMOD RECORD, 2006, 35 (02) : 16 - 21
  • [6] Frequent Itemset Generation using Double Hashing Technique
    Jayalakshmi, N.
    Vidhya, V.
    Krishnamurthy, M.
    Kannan, A.
    INTERNATIONAL CONFERENCE ON MODELLING OPTIMIZATION AND COMPUTING, 2012, 38 : 1467 - 1478
  • [7] Mining φ-Frequent Itemset Using FP-Tree
    李天瑞
    Journal of Southwest Jiaotong University, 2001, (01) : 67 - 74
  • [8] Frequent itemset mining using cellular learning automata
    Sohrabi, Mohammad Karim
    Roshani, Reza
    COMPUTERS IN HUMAN BEHAVIOR, 2017, 68 : 244 - 253
  • [9] Parallel frequent itemset mining using systolic arrays
    Sohrabi, Mohammad Karim
    Barforoush, Ahmad Abdollahzadeh
    KNOWLEDGE-BASED SYSTEMS, 2013, 37 : 462 - 471
  • [10] Efficiently Using Matrix in Mining Maximum Frequent Itemset
    Liu Zhen-yu
    Xu Wei-xiang
    Liu Xumin
    THIRD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING: WKDD 2010, PROCEEDINGS, 2010, : 50 - 54