Frequent itemset discovery with SQL using universal quantification

Cited by: 0

Authors:

Rantzau, R [1]

Affiliation:

[1] Univ Stuttgart, Dept Comp Sci Elect Engn & Informat Technol, D-70569 Stuttgart, Germany

Source:

DATABASE SUPPORT FOR DATA MINING APPLICATIONS: DISCOVERING KNOWLEDGE WITH INDUCTIVE QUERIES | 2004, vol. 2682

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Algorithms for finding frequent itemsets fall into two broad categories: algorithms that are based on non-trivial SQL statements to query and update a database, and algorithms that employ sophisticated in-memory data structures, where the data is stored in flat files. Most performance experiments have shown that SQL-based approaches are inferior to main-memory algorithms. However, the current trend of database vendors to integrate analysis functionality into their query execution and optimization components, i.e., "closer to the data," suggests revisiting these results and searching for new, potentially better solutions. We investigate approaches based on SQL-92 and present a new approach called Quiver that employs universal and existential quantification. In the table schema for itemsets of our approach, a group of tuples represents a single itemset. Such a "vertical" layout is similar to the popular layout used for the transaction table, which is the input of frequent itemset discovery. We show that current DBMSs do not provide efficient query processing strategies for dealing with quantified queries, mostly due to the lack of an adequate SQL syntax for set containment tests. Performance tests using a query processor prototype and a novel query operator, called set containment division, promise improved performance for quantified queries like those used for Quiver.
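
To make the idea of a quantified support query over "vertical" tables more concrete, the following is a minimal sketch and not the Quiver query from the paper. It assumes two hypothetical tables, transactions(tid, item) and itemsets(itemset, item), in which a group of rows represents one transaction or one candidate itemset. The set containment test "every item of the candidate occurs in the transaction" is written in plain SQL as a doubly nested NOT EXISTS, the usual way to express universal quantification when no dedicated syntax is available. The function name, table names, and the use of SQLite through Python's sqlite3 module are all illustrative assumptions.

```python
import sqlite3


def itemset_support(transactions, candidates):
    """Count, for each candidate itemset, the transactions that contain it.

    Both arguments map an id to an iterable of items. The table and column
    names below are hypothetical and chosen only for this illustration.
    """
    con = sqlite3.connect(":memory:")
    # Vertical layout: one row per (transaction, item) and (itemset, item).
    con.execute("CREATE TABLE transactions (tid INTEGER, item TEXT)")
    con.execute("CREATE TABLE itemsets (itemset INTEGER, item TEXT)")
    con.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [(tid, item) for tid, items in transactions.items() for item in items],
    )
    con.executemany(
        "INSERT INTO itemsets VALUES (?, ?)",
        [(sid, item) for sid, items in candidates.items() for item in items],
    )
    # "For all items i of candidate c: i occurs in transaction t" is rewritten
    # as "there is no item of c that is missing from t".
    query = """
        SELECT c.itemset, COUNT(*) AS support
        FROM (SELECT DISTINCT itemset FROM itemsets) AS c,
             (SELECT DISTINCT tid FROM transactions) AS t
        WHERE NOT EXISTS (
            SELECT * FROM itemsets AS i
            WHERE i.itemset = c.itemset
              AND NOT EXISTS (
                  SELECT * FROM transactions AS x
                  WHERE x.tid = t.tid AND x.item = i.item))
        GROUP BY c.itemset
    """
    result = dict(con.execute(query).fetchall())
    con.close()
    return result


# Hypothetical toy data: three transactions and three candidate itemsets.
demo_transactions = {1: ["a", "b", "c"], 2: ["a", "c"], 3: ["b"]}
demo_candidates = {10: ["a", "c"], 11: ["b", "c"], 12: ["a", "b", "c"]}
print(itemset_support(demo_transactions, demo_candidates))
```

In the toy data, candidate 10 ({a, c}) is contained in transactions 1 and 2, so its support is 2. Candidates that no transaction contains produce no row and are therefore absent from the result. The query pairs every candidate with every transaction and evaluates nested subqueries for each pair, which illustrates why such formulations tend to be costly without a dedicated set containment operator.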


Pages: 194-213

Page count: 20
