Fuzzy high-utility pattern mining in parallel and distributed Hadoop framework

Cited by: 50
Authors
Wu, Jimmy Ming-Tai [1]
Srivastava, Gautam [2, 3]
Wei, Min [1]
Yun, Unil [4]
Lin, Jerry Chun-Wei [5]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China
[2] Brandon Univ, Dept Math & Comp Sci, 270 18th St, Brandon, MB R7A 6A9, Canada
[3] China Med Univ, Res Ctr Interneural Comp, Taichung 40402, Taiwan
[4] Sejong Univ, Dept Comp Engn, Seoul, South Korea
[5] Western Norway Univ Appl Sci, Dept Comp Sci Elect Engn & Math Sci, Bergen, Norway
Keywords
Hadoop; High fuzzy utility pattern; High utility itemset mining; Big-data; Fuzzy-set theory; MapReduce; ITEMSETS; ALGORITHM; STRATEGY;
DOI
10.1016/j.ins.2020.12.004
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Over the past decade, high-utility itemset mining (HUIM) has received widespread attention because it can reveal more critical information than frequent itemset mining (FIM). However, HUIM remains very similar to FIM in that it determines itemsets using a binary model based on a pre-defined minimum utility threshold. Additionally, most previous HUIM works focus only on single, small datasets, which is unrealistic for today's real-world big-data environments. In this work, fuzzy-set theory and a MapReduce framework are combined to design a novel high fuzzy utility pattern mining algorithm that resolves these issues. Fuzzy-set theory is first applied, and a new algorithm called efficient high fuzzy utility itemset mining (EFUPM) is designed to discover high fuzzy utility patterns on a single machine. Two upper bounds are then estimated to allow early pruning of unpromising candidates in the search space. To handle large-scale datasets, a Hadoop-based high fuzzy utility pattern mining (HFUPM) algorithm is then developed to discover high fuzzy utility patterns on the Hadoop framework. Experimental results show that the proposed algorithms perform well in mining the required high fuzzy utility patterns, whether on a single machine or in a large-scale environment, compared to the current state-of-the-art approaches. © 2020 The Author(s). Published by Elsevier Inc.
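To make the idea of a "high fuzzy utility pattern" more concrete, the following is a minimal single-machine sketch in Python. It is not the EFUPM or HFUPM algorithm from the paper: the Low/Middle/High membership functions, the rule that an itemset's fuzzy utility in a transaction is the minimum membership degree times the summed item utilities, and all names (`memberships`, `fuzzy_utility`, `mine`, the toy `db` and `profit`) are assumptions chosen for illustration. The enumeration is brute force and applies no upper-bound pruning and no MapReduce partitioning.

```python
from itertools import combinations, product

REGIONS = ("L", "M", "H")  # hypothetical Low / Middle / High quantity regions


def memberships(q):
    """Assumed membership degrees of a purchase quantity q in each region."""
    return {
        "L": max(0.0, min(1.0, (6 - q) / 5)),
        "M": max(0.0, 1 - abs(q - 6) / 5),
        "H": max(0.0, min(1.0, (q - 6) / 5)),
    }


def fuzzy_utility(fuzzy_itemset, db, profit):
    """Sum over transactions of (min membership) * (utility of the items).

    fuzzy_itemset is a tuple of (item, region) pairs; each transaction in db
    maps an item to its purchased quantity. This definition is an assumption.
    """
    total = 0.0
    for tx in db:
        if any(item not in tx for item, _ in fuzzy_itemset):
            continue  # the transaction does not contain the whole itemset
        degree = min(memberships(tx[item])[region] for item, region in fuzzy_itemset)
        utility = sum(tx[item] * profit[item] for item, _ in fuzzy_itemset)
        total += degree * utility
    return total


def mine(db, profit, min_utility, max_size=2):
    """Brute-force search for fuzzy itemsets meeting the utility threshold."""
    items = sorted(profit)
    found = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            for regions in product(REGIONS, repeat=size):
                candidate = tuple(zip(combo, regions))
                fu = fuzzy_utility(candidate, db, profit)
                if fu >= min_utility:
                    found[candidate] = fu
    return found


# Toy data: three transactions and per-unit profits (invented for the sketch).
db = [{"a": 1, "b": 6}, {"a": 11, "b": 6}, {"b": 11}]
profit = {"a": 2, "b": 1}
patterns = mine(db, profit, min_utility=20)
print(patterns)
```

On this toy data the pair (a.High, b.Middle) passes the threshold even though b.Middle alone does not, which illustrates why utility-based measures cannot be pruned by simple subset checks and why upper bounds such as those mentioned in the abstract are useful.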
Pages: 31-48
Number of pages: 18
Related papers
50 in total
  • [31] Efficiently mining uncertain high-utility itemsets
    Lin, Jerry Chun-Wei
    Gan, Wensheng
    Fournier-Viger, Philippe
    Hong, Tzung-Pei
    Tseng, Vincent S.
    SOFT COMPUTING, 2017, 21 (11) : 2801 - 2820
  • [32] Mining High-Utility Itemsets with Irregular Occurrence
    Laoviboon, Supachai
    Amphawan, Komate
    2017 9TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SMART TECHNOLOGY (KST), 2017, : 89 - 94
  • [33] Efficiently mining uncertain high-utility itemsets
    Jerry Chun-Wei Lin
    Wensheng Gan
    Philippe Fournier-Viger
    Tzung-Pei Hong
    Vincent S. Tseng
    Soft Computing, 2017, 21 : 2801 - 2820
  • [34] Fast mining local high-utility itemsets
    Song, Wei
    Ren, Guibin
    Gan, Wensheng
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 145
  • [35] QuAX: Mining the Web for High-utility FAQ
    Rashid, Muhammad Shihab
    Jamour, Fuad
    Hristidis, Vagelis
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 1518 - 1527
  • [36] A survey of incremental high-utility itemset mining
    Gan, Wensheng
    Lin, Jerry Chun-Wei
    Fournier-Viger, Philippe
    Chao, Han-Chieh
    Hong, Tzung-Pei
    Fujita, Hamido
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2018, 8 (02)
  • [37] Synthesization of High-Utility Patterns in Parallel Computing
    Lin, Jerry Chun-Wei
    Li, Yuanfa
    Pirouz, Matin
    Tang, Linlin
    Voznak, Miroslav
    Sevcik, Lukas
    2019 IEEE/ACM 23RD INTERNATIONAL SYMPOSIUM ON DISTRIBUTED SIMULATION AND REAL TIME APPLICATIONS (DS-RT), 2019, : 276 - 282
  • [38] EA-HUFIM: Optimization for Fuzzy-Based High-Utility Itemsets Mining
    Fan Yang
    Nankun Mu
    Xiaofeng Liao
    Xinyu Lei
    International Journal of Fuzzy Systems, 2021, 23 : 1652 - 1668
  • [39] EA-HUFIM: Optimization for Fuzzy-Based High-Utility Itemsets Mining
    Yang, Fan
    Mu, Nankun
    Liao, Xiaofeng
    Lei, Xinyu
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2021, 23 (06) : 1652 - 1668
  • [40] Mining Transactional Databases for Frequent and High-Utility Fuzzy Sequential Patterns With Time Intervals
    Ritika
    Gupta, Sunil Kumar
    IEEE ACCESS, 2022, 10 : 71107 - 71119