New algorithms for finding approximate frequent item sets

Cited by: 7
Authors
Borgelt, Christian [1]
Braune, Christian [1, 2]
Koetter, Tobias [3]
Gruen, Sonja [4, 5]
Affiliations
[1] European Ctr Soft Comp, Mieres 33600, Asturias, Spain
[2] Otto Von Guericke Univ, Dept Comp Sci, D-39106 Magdeburg, Germany
[3] Univ Konstanz, Dept Comp Sci, D-78457 Constance, Germany
[4] RIKEN, Brain Sci Inst, Wako, Saitama 3510198, Japan
[5] Res Ctr Julich, Inst Neurosci & Med INM 6, Julich, Germany
Keywords
ASSOCIATION; NOISE;
DOI
10.1007/s00500-011-0776-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In standard frequent item set mining a transaction supports an item set only if all items in the set are present. However, in many cases this is too strict a requirement that can render it impossible to find certain relevant groups of items. By relaxing the support definition, allowing for some items of a given set to be missing from a transaction, this drawback can be amended. The resulting item sets have been called approximate, fault-tolerant or fuzzy item sets. In this paper we present two new algorithms to find such item sets: the first is an extension of item set mining based on cover similarities and computes and evaluates the subset size occurrence distribution with a scheme that is related to the Eclat algorithm. The second employs a clustering-like approach, in which the distances are derived from the item covers with distance measures for sets or binary vectors and which is initialized with a one-dimensional Sammon projection of the distance matrix. We demonstrate the benefits of our algorithms by applying them to a concept detection task on the 2008/2009 Wikipedia Selection for schools and to the neurobiological task of detecting neuron ensembles in (simulated) parallel spike trains.
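The relaxed support definition described in the abstract can be illustrated with a minimal sketch. This is not either of the paper's two algorithms; it is a brute-force count under the assumption that a transaction supports an item set when at most a fixed number of the set's items are missing. The function name `fault_tolerant_support`, the parameter `max_missing`, and the toy transactions are hypothetical and chosen only for illustration.

```python
def fault_tolerant_support(transactions, item_set, max_missing=1):
    """Count transactions that lack at most `max_missing` items of `item_set`.

    With max_missing=0 this reduces to the standard (exact) support.
    Hypothetical helper for illustration, not the paper's method.
    """
    items = set(item_set)
    count = 0
    for transaction in transactions:
        # Items of the set that do not occur in this transaction.
        missing = len(items - set(transaction))
        if missing <= max_missing:
            count += 1
    return count


# Toy data (made up for this example).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b", "d"},
    {"a", "b", "c", "d"},
]
item_set = {"a", "b", "c"}

exact = fault_tolerant_support(transactions, item_set, max_missing=0)
relaxed = fault_tolerant_support(transactions, item_set, max_missing=1)
print(exact, relaxed)  # prints "2 4"
```

In this toy data, only two transactions contain all three items, while four contain at least two of them, so the item set could pass a minimum-support threshold under the relaxed definition that it would fail under the strict one.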
Pages: 903-917
Page count: 15
Related papers
50 in total
  • [1] New algorithms for finding approximate frequent item sets
    Christian Borgelt
    Christian Braune
    Tobias Kötter
    Sonja Grün
    Soft Computing, 2012, 16 : 903 - 917
  • [2] Finding Frequent Item Sets from Sparse Matrix
    Zheng Xiao-Yan
    Sun Ji-Zhou
    ICECT: 2009 INTERNATIONAL CONFERENCE ON ELECTRONIC COMPUTER TECHNOLOGY, PROCEEDINGS, 2009, : 615 - 619
  • [3] On Genetic Algorithms for Detecting Frequent Item Sets And Large Bite Sets
    Sizov, Roman A.
    Simovici, Dan A.
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION (MLDM 2016), 2016, 9729 : 435 - 445
  • [4] An improved parallel algorithm for finding frequent item-sets
    She, CD
    Li, L
    Wang, HB
    Gao, B
    Deng, HQ
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON INTELLIGENT MECHATRONICS AND AUTOMATION, 2004, : 383 - 386
  • [5] Multicore Framework for Finding Frequent Item-Sets Using TDS
    Khawaja, Sajid Gul
    Tehreem, Amna
    Akram, M. Usman
    Khan, Shoab Ahmed
    PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS (HIS 2016), 2017, 552 : 340 - 349
  • [6] Finding frequent subgraphs in biological networks via maximal item sets
    Zantema, Hans
    Wagemans, Stefan
    Bosnacki, Dragan
    BIOINFORMATICS RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2008, 13 : 303 - +
  • [7] A new mining algorithm based on frequent item sets
    Wen Yun
    FIRST INTERNATIONAL WORKSHOP ON KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2007, : 410 - 413
  • [8] Fast algorithms for finding first-class frequent itemsets with item constraints
    Gao, Fei
    Xie, Wei-Xin
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2001, 38 (11):
  • [9] Study on the Discovery Algorithm of the Frequent Item Sets
    Cheng, Huifeng
    Ma, Yanli
    Li, Fangping
    2009 INTERNATIONAL ASIA SYMPOSIUM ON INTELLIGENT INTERACTION AND AFFECTIVE COMPUTING, 2009, : 172 - +
  • [10] Finding neural assemblies with frequent item set mining
    Picado-Muino, David
    Borgelt, Christian
    Berger, Denise
    Gerstein, George
    Gruen, Sonja
    FRONTIERS IN NEUROINFORMATICS, 2013, 7