Frequent itemset discovery with SQL using universal quantification

Cited by: 0
Authors
Rantzau, R [1 ]
Affiliations
[1] Univ Stuttgart, Dept Comp Sci Elect Engn & Informat Technol, D-70569 Stuttgart, Germany
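Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code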
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Algorithms for finding frequent itemsets fall into two broad categories: algorithms that use non-trivial SQL statements to query and update a database, and algorithms that employ sophisticated in-memory data structures, where the data is stored in flat files. Most performance experiments have shown that SQL-based approaches are inferior to main-memory algorithms. However, the current trend among database vendors to integrate analysis functionality into their query execution and optimization components, i.e., "closer to the data," suggests revisiting these results and searching for new, potentially better solutions. We investigate approaches based on SQL-92 and present a new approach called Quiver that employs universal and existential quantification. In the table schema that our approach uses for itemsets, a group of tuples represents a single itemset. Such a "vertical" layout is similar to the popular layout used for the transaction table, which is the input of frequent itemset discovery. We show that current DBMSs do not provide efficient query processing strategies for quantified queries, mostly because SQL lacks an adequate syntax for set containment tests. Performance tests using a query processor prototype and a novel query operator, called set containment division, promise improved performance for quantified queries like those used for Quiver.
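To make the idea of a set containment test over "vertical" tables concrete, the following is a minimal sketch and not the Quiver queries from the paper. It assumes hypothetical table names (`transactions`, `candidates`), hypothetical column names, and toy data, and it expresses universal quantification in the usual SQL-92 way, as a doubly nested NOT EXISTS: a transaction supports an itemset if there is no item of the itemset that the transaction lacks.

```python
import sqlite3

# Hypothetical schema and toy data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (tid INTEGER, item TEXT);    -- one row per (transaction, item)
    CREATE TABLE candidates   (itemset INTEGER, item TEXT); -- one row per (itemset, item)
""")
conn.executemany("INSERT INTO transactions VALUES (?, ?)", [
    (1, "a"), (1, "b"), (1, "c"),
    (2, "a"), (2, "b"),
    (3, "b"), (3, "c"),
])
conn.executemany("INSERT INTO candidates VALUES (?, ?)", [
    (10, "a"), (10, "b"),   # itemset 10 = {a, b}
    (20, "a"), (20, "c"),   # itemset 20 = {a, c}
])

# For each (itemset, transaction) pair, keep the pair only if no item of the
# itemset is missing from the transaction, then count pairs per itemset.
SUPPORT_QUERY = """
SELECT c.itemset, COUNT(*) AS support
FROM (SELECT DISTINCT itemset FROM candidates) AS c,
     (SELECT DISTINCT tid FROM transactions) AS t
WHERE NOT EXISTS (
    SELECT * FROM candidates AS c2
    WHERE c2.itemset = c.itemset
      AND NOT EXISTS (
          SELECT * FROM transactions AS t2
          WHERE t2.tid = t.tid AND t2.item = c2.item))
GROUP BY c.itemset
ORDER BY c.itemset
"""

def count_support(connection):
    """Return {itemset id: number of transactions containing every item of the itemset}."""
    return dict(connection.execute(SUPPORT_QUERY).fetchall())

support = count_support(conn)
print(support)
```

The query compares every itemset with every transaction, which hints at why such quantified queries can be expensive without a dedicated operator for set containment.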
Pages: 194 - 213
Number of pages: 20
Related papers
50 in total
  • [41] A primer to frequent itemset mining for bioinformatics
    Naulaerts, Stefan
    Meysman, Pieter
    Bittremieux, Wout
    Trung Nghia Vu
    Vanden Berghe, Wim
    Goethals, Bart
    Laukens, Kris
    BRIEFINGS IN BIOINFORMATICS, 2015, 16 (02) : 216 - 231
  • [42] An efficient frequent itemset mining algorithm
    Luo, Ke
    Zhang, Xue-Mao
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 756 - 761
  • [43] A parallel algorithm for frequent itemset mining
    Li, L
    Zhai, DH
    Fan, J
    PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PDCAT'2003, PROCEEDINGS, 2003, : 868 - 871
  • [44] Frequent itemset mining with parallel RDBMS
    Shang, XQ
    Sattler, KU
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2005, 3518 : 539 - 544
  • [45] A frequent itemset generation approach in data mining using transaction-labelling dynamic itemset counting method
    Balaram, Ambily
    Raju, Nedunchezhian
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2025, 17 (01)
  • [46] Frequent itemset mining with bit search
    Venkatesan, N.
    Ramaraj
    Journal of Theoretical and Applied Information Technology, 41 (01): : 111 - 121
  • [47] HashEclat: an efficient frequent itemset algorithm
    Zhang, Chunkai
    Tian, Panbo
    Zhang, Xudong
    Liao, Qing
    Jiang, Zoe L.
    Wang, Xuan
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2019, 10 (11) : 3003 - 3016
  • [48] HashEclat: an efficient frequent itemset algorithm
    Chunkai Zhang
    Panbo Tian
    Xudong Zhang
    Qing Liao
    Zoe L. Jiang
    Xuan Wang
    International Journal of Machine Learning and Cybernetics, 2019, 10 : 3003 - 3016
  • [49] Video mining with frequent itemset configurations
    Quack, Till
    Ferrari, Vittorio
    Van Gool, Luc
    IMAGE AND VIDEO RETRIEVAL, PROCEEDINGS, 2006, 4071 : 360 - 369
  • [50] Frequent closed informative itemset mining
    Fu, Huaiguo
    Foghlu, Micheal O.
    Donnelly, Willie
    CIS: 2007 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY, PROCEEDINGS, 2007, : 232 - +