New algorithms for finding approximate frequent item sets

Cited by: 7
Authors
Borgelt, Christian [1]
Braune, Christian [1, 2]
Koetter, Tobias [3]
Gruen, Sonja [4, 5]
Affiliations
[1] European Ctr Soft Comp, Mieres 33600, Asturias, Spain
[2] Otto Von Guericke Univ, Dept Comp Sci, D-39106 Magdeburg, Germany
[3] Univ Konstanz, Dept Comp Sci, D-78457 Constance, Germany
[4] RIKEN, Brain Sci Inst, Wako, Saitama 3510198, Japan
[5] Res Ctr Julich, Inst Neurosci & Med INM 6, Julich, Germany
Keywords
ASSOCIATION; NOISE;
DOI
10.1007/s00500-011-0776-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In standard frequent item set mining a transaction supports an item set only if all items in the set are present. However, in many cases this is too strict a requirement that can render it impossible to find certain relevant groups of items. By relaxing the support definition, allowing for some items of a given set to be missing from a transaction, this drawback can be amended. The resulting item sets have been called approximate, fault-tolerant or fuzzy item sets. In this paper we present two new algorithms to find such item sets: the first is an extension of item set mining based on cover similarities and computes and evaluates the subset size occurrence distribution with a scheme that is related to the Eclat algorithm. The second employs a clustering-like approach, in which the distances are derived from the item covers with distance measures for sets or binary vectors and which is initialized with a one-dimensional Sammon projection of the distance matrix. We demonstrate the benefits of our algorithms by applying them to a concept detection task on the 2008/2009 Wikipedia Selection for schools and to the neurobiological task of detecting neuron ensembles in (simulated) parallel spike trains.
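The relaxed support definition described in the abstract can be illustrated with a minimal brute-force sketch. It is not the authors' algorithm (neither the Eclat-related scheme nor the clustering-like approach); the function names `subset_size_distribution` and `approximate_support` and the parameter `max_missing` are hypothetical and chosen only for illustration. The first function counts, for each k, how many transactions contain exactly k items of the given set; the second counts the transactions that miss at most `max_missing` items.

```python
from typing import Iterable, List, Set


def subset_size_distribution(item_set: Set[str],
                             transactions: Iterable[Set[str]]) -> List[int]:
    """Return dist where dist[k] is the number of transactions that
    contain exactly k of the items in item_set (hypothetical helper)."""
    dist = [0] * (len(item_set) + 1)
    for transaction in transactions:
        dist[len(item_set & transaction)] += 1
    return dist


def approximate_support(item_set: Set[str],
                        transactions: Iterable[Set[str]],
                        max_missing: int) -> int:
    """Count transactions that lack at most max_missing items of item_set.
    With max_missing = 0 this reduces to the standard (exact) support."""
    dist = subset_size_distribution(item_set, transactions)
    smallest_allowed = max(0, len(item_set) - max_missing)
    return sum(dist[smallest_allowed:])


# Toy data, invented purely for this example.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a"},
    {"b", "c", "d"},
    {"d"},
]
item_set = {"a", "b", "c"}

print(subset_size_distribution(item_set, transactions))
print(approximate_support(item_set, transactions, max_missing=0))
print(approximate_support(item_set, transactions, max_missing=1))
```

On this toy data, only one transaction supports the set under the standard definition, while allowing one missing item raises the count to three. This is the effect the abstract describes when the strict requirement is relaxed.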
Pages: 903-917
Page count: 15
Related papers
50 in total
  • [21] Algorithm of Frequent Item Sets Mining Based on Index Table
    Zhang Lin
    Yao Nanzhen
    Zhang Jianli
    MECHATRONICS, ROBOTICS AND AUTOMATION, PTS 1-3, 2013, 373-375 : 1076 - +
  • [22] Design and Implementation of Improved Algorithm for Frequent Item Sets Mining
    Zhang Lin
    Zhang Jianli
    PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 1696 - 1698
  • [23] An Improved Association Rules Algorithm based on Frequent Item Sets
    Jiang, Yaqiong
    Wang, Jun
    CEIS 2011, 2011, 15
  • [24] Method for Mining Frequent Item Sets Considering Average Utility
    Agarwal, Reshu
    Gautam, Arti
    Saksena, Ayush Kumar
    Rai, Amrita
    Karatangi, Shylaja VinayKumar
    2021 INTERNATIONAL CONFERENCE ON EMERGING SMART COMPUTING AND INFORMATICS (ESCI), 2021, : 275 - 278
  • [25] Detecting Spatial Trends in Hockey Using Frequent Item Sets
    Morgan, Stuart
    PROCEEDINGS OF THE 8TH INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE IN SPORT (IACSS2011), 2011, : 58 - 61
  • [26] Efficient Mining of Frequent Item Sets on Large Uncertain Databases
    Wang, Liang
    Cheung, David Wai-Lok
    Cheng, Reynold
    Lee, Sau Dan
    Yang, Xuan S.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (12) : 2170 - 2183
  • [27] Parallel algorithm for mining frequent item sets based on Spark
    Mao Y.
    Wu B.
    Xu C.
    Zhang M.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2023, 29 (04): : 1267 - 1283
  • [28] Efficient algorithm for mining approximate frequent item over data streams
    Wang, Wei-Ping
    Li, Jian-Zhong
    Zhang, Dong-Dong
    Guo, Long-Jiang
    Ruan Jian Xue Bao/Journal of Software, 2007, 18 (04): : 884 - 892
  • [29] On the big data processing algorithms for finding frequent sequences
    Can, Ali Burak
    Zaval, Mounes
    Uzun-Per, Meryem
    Aktas, Mehmet S.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (24):
  • [30] FrogCOL and FrogMIS: new decentralized algorithms for finding large independent sets in graphs
    Christian Blum
    Borja Calvo
    Maria J. Blesa
    Swarm Intelligence, 2015, 9 : 205 - 227