Fuzzy high-utility pattern mining in parallel and distributed Hadoop framework

Cited by: 50
Authors
Wu, Jimmy Ming-Tai [1]
Srivastava, Gautam [2, 3]
Wei, Min [1]
Yun, Unil [4]
Lin, Jerry Chun-Wei [5]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China
[2] Brandon Univ, Dept Math & Comp Sci, 270 18th St, Brandon, MB R7A 6A9, Canada
[3] China Med Univ, Res Ctr Interneural Comp, Taichung 40402, Taiwan
[4] Sejong Univ, Dept Comp Engn, Seoul, South Korea
[5] Western Norway Univ Appl Sci, Dept Comp Sci Elect Engn & Math Sci, Bergen, Norway
Keywords
Hadoop; High fuzzy utility pattern; High utility itemset mining; Big-data; Fuzzy-set theory; MapReduce; ITEMSETS; ALGORITHM; STRATEGY;
DOI
10.1016/j.ins.2020.12.004
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Over the past decade, high-utility itemset mining (HUIM) has received widespread attention because it can reveal more critical information than frequent itemset mining (FIM). However, HUIM shares a limitation with FIM: it selects itemsets with a binary model based on a pre-defined minimum utility threshold. In addition, most previous HUIM work has focused on single, small datasets, which is unrealistic for today's real-world big-data environments. In this work, fuzzy-set theory and a MapReduce framework are combined to design a novel high fuzzy utility pattern mining algorithm that addresses these issues. First, fuzzy-set theory is applied and a new algorithm called efficient high fuzzy utility itemset mining (EFUPM) is designed to discover high fuzzy utility patterns on a single machine. Two upper bounds are then estimated to allow early pruning of unpromising candidates in the search space. To handle large-scale datasets, a Hadoop-based high fuzzy utility pattern mining (HFUPM) algorithm is then developed to discover high fuzzy utility patterns on the Hadoop framework. Experimental results show that the proposed algorithms perform well compared with current state-of-the-art approaches in mining high fuzzy utility patterns, both on a single machine and in a large-scale environment. © 2020 The Author(s). Published by Elsevier Inc.
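As a rough illustration of the binary threshold model and the partitioned processing that the abstract refers to, the following minimal sketch computes crisp (non-fuzzy) itemset utilities in a map/reduce style. It is not the EFUPM or HFUPM algorithm: the toy transactions, the `PROFIT` table, and all function names are assumptions made for this example, the fuzzy membership step is omitted, and no upper-bound pruning is applied.

```python
from collections import Counter
from functools import reduce
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
    {"b": 4},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 2, "c": 1}


def map_partition(transactions, max_size=2):
    """Map step: utility of every itemset (up to max_size items) in one partition."""
    local = Counter()
    for tx in transactions:
        items = sorted(tx)
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                # Utility of an itemset in a transaction: sum of quantity * profit.
                local[itemset] += sum(tx[i] * PROFIT[i] for i in itemset)
    return local


def reduce_utilities(left, right):
    """Reduce step: itemset utilities are additive across partitions."""
    return left + right


def high_utility_itemsets(partitions, min_utility):
    """Keep itemsets whose total utility reaches the threshold (binary model)."""
    totals = reduce(reduce_utilities, (map_partition(p) for p in partitions), Counter())
    return {itemset: u for itemset, u in totals.items() if u >= min_utility}


if __name__ == "__main__":
    # Split the toy database into two partitions, as separate workers would see it.
    partitions = [TRANSACTIONS[:2], TRANSACTIONS[2:]]
    print(high_utility_itemsets(partitions, min_utility=15))
```

Because utilities simply add up across partitions, the result does not depend on how the database is split. The sketch enumerates every candidate itemset, which is exactly the cost that the upper bounds mentioned in the abstract are meant to reduce.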
Pages: 31-48
Page count: 18
Related papers
50 in total
  • [21] A hybrid framework for mining high-utility itemsets in a sparse transaction database
    Siddharth Dawar
    Vikram Goyal
    Debajyoti Bera
    Applied Intelligence, 2017, 47 : 809 - 827
  • [22] High-utility and diverse itemset mining
    Amit Verma
    Siddharth Dawar
    Raman Kumar
    Shamkant Navathe
    Vikram Goyal
    Applied Intelligence, 2021, 51 : 4649 - 4663
  • [23] A hybrid framework for mining high-utility itemsets in a sparse transaction database
    Dawar, Siddharth
    Goyal, Vikram
    Bera, Debajyoti
    APPLIED INTELLIGENCE, 2017, 47 (03) : 809 - 827
  • [24] Mining Minimal High-Utility Itemsets
    Fournier-Viger, Philippe
    Lin, Jerry Chun-Wei
    Wu, Cheng-Wei
    Tseng, Vincent S.
    Faghihi, Usef
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2016, PT I, 2016, 9827 : 88 - 101
  • [25] ParaDiS: a Parallel and Distributed framework for Significant pattern mining
    Jyoti
    Kailasam, Sriram
    Buzmakov, Aleksey
    2023 IEEE/ACM 23RD INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING WORKSHOPS, CCGRIDW, 2023, : 249 - 255
  • [26] High-utility and diverse itemset mining
    Verma, Amit
    Dawar, Siddharth
    Kumar, Raman
    Navathe, Shamkant
    Goyal, Vikram
    APPLIED INTELLIGENCE, 2021, 51 (07) : 4649 - 4663
  • [27] Efficient Mining of High-Utility Sequential Rules
    Zida, Souleymane
    Fournier-Viger, Philippe
    Wu, Cheng-Wei
    Lin, Jerry Chun-Wei
    Tseng, Vincent S.
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, MLDM 2015, 2015, 9166 : 157 - 171
  • [28] PHM: Mining Periodic High-Utility Itemsets
    Fournier-Viger, Philippe
    Lin, Jerry Chun-Wei
    Quang-Huy Duong
    Thu-Lan Dam
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS, 2016, 9728 : 64 - 79
  • [29] Mining High-Utility Patterns in Uncertain Tensors
    Coussat, Aurelien
    Nadisic, Nicolas
    Cerf, Loic
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES-2018), 2018, 126 : 403 - 412
  • [30] High-Utility Itemset Mining in Big Dataset
    Wu, Jimmy Ming-Tai
    Lin, Jerry Chun-Wei
    Chen, Chien-Ming
    2019 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - TAIWAN (ICCE-TW), 2019,