Frequent itemset discovery with SQL using universal quantification

Authors
Rantzau, R. [1]
Affiliation
[1] Univ Stuttgart, Dept Comp Sci Elect Engn & Informat Technol, D-70569 Stuttgart, Germany
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Algorithms for finding frequent itemsets fall into two broad categories: algorithms that use non-trivial SQL statements to query and update a database, and algorithms that employ sophisticated in-memory data structures, with the data stored in flat files. Most performance experiments have shown that SQL-based approaches are inferior to main-memory algorithms. However, the current trend among database vendors to integrate analysis functionality into their query execution and optimization components, i.e., "closer to the data," suggests revisiting these results and searching for new, potentially better solutions. We investigate approaches based on SQL-92 and present a new approach called Quiver that employs universal and existential quantification. In the table schema our approach uses for itemsets, a group of tuples represents a single itemset. Such a "vertical" layout is similar to the popular layout used for the transaction table, which is the input of frequent itemset discovery. We show that current DBMSs do not provide efficient query processing strategies for quantified queries, mostly because SQL lacks an adequate syntax for set containment tests. Performance tests using a query processor prototype and a novel query operator, called set containment division, promise improved performance for quantified queries like those used for Quiver.
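The set containment test mentioned in the abstract can be illustrated with a small sketch. The example below is not the Quiver query from the paper; it is a minimal illustration, assuming two hypothetical vertical tables (`transactions(tid, item)` and `candidates(itemset, item)`) and made-up sample data. It expresses "every item of the candidate occurs in the transaction" with the usual SQL-92 workaround for universal quantification, a doubly nested NOT EXISTS, and runs it through Python's sqlite3 module.

```python
import sqlite3

# In-memory database; table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (tid INTEGER, item TEXT);    -- one row per (transaction, item)
    CREATE TABLE candidates (itemset INTEGER, item TEXT);  -- a group of rows forms one itemset
""")
conn.executemany("INSERT INTO transactions VALUES (?, ?)", [
    (1, "a"), (1, "b"), (1, "c"),
    (2, "a"), (2, "b"),
    (3, "b"), (3, "c"),
])
conn.executemany("INSERT INTO candidates VALUES (?, ?)", [
    (10, "a"), (10, "b"),
    (20, "b"), (20, "c"),
    (30, "a"), (30, "c"),
])

# "For all items of candidate c, the item occurs in transaction t" is written
# as "there is no item of c that is missing from t" (double NOT EXISTS).
SUPPORT_QUERY = """
SELECT c.itemset, COUNT(*) AS support
FROM (SELECT DISTINCT itemset FROM candidates) AS c,
     (SELECT DISTINCT tid FROM transactions) AS t
WHERE NOT EXISTS (
    SELECT * FROM candidates AS c2
    WHERE c2.itemset = c.itemset
      AND NOT EXISTS (
          SELECT * FROM transactions AS t2
          WHERE t2.tid = t.tid AND t2.item = c2.item))
GROUP BY c.itemset
HAVING COUNT(*) >= ?
ORDER BY c.itemset
"""

def frequent_itemsets(min_support):
    """Return (itemset id, support) pairs whose support reaches min_support."""
    return conn.execute(SUPPORT_QUERY, (min_support,)).fetchall()

if __name__ == "__main__":
    print(frequent_itemsets(2))
```

The query pairs every candidate with every transaction and filters with correlated subqueries, which hints at why such quantified queries can be expensive for an optimizer that has no dedicated set containment operator.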
Pages: 194-213
Page count: 20