Synthesizing High-Utility Patterns from Different Data Sources

Cited by: 0
Authors
Muley, Abhinav [1]
Gudadhe, Manish [1]
Affiliations
[1] St Vincent Pallotti Coll Engn & Technol, Dept Comp Engn, Nagpur 441108, Maharashtra, India
Keywords
data integration; data mining; high-utility patterns; knowledge discovery; weighted model; multi-database mining; distributed data mining
DOI
10.3390/data3030032
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Large organizations often need to collect data from branches spread across different geographic locations. One option is to gather all of this data at a centralized location and mine the amassed database as a whole (mono-mining) to generate interesting patterns. However, it is also feasible to mine the useful patterns at each data source and forward only these patterns to the centralized location, rather than the entire original database. These patterns can still be very numerous, and different sources calculate different utility values for the same pattern. This paper proposes a weighted model for aggregating the high-utility patterns from different data sources. It also proposes a pattern selection procedure that discards low-utility patterns so that high-utility patterns can be extracted efficiently in the weighted model. The synthesizing model yields high-utility patterns, unlike association rule mining, which generates frequent itemsets by treating every item as having equal utility, an assumption that does not hold in real-life applications such as sales transactions. Extensive experiments on datasets with varied characteristics show that the proposed algorithm is effective for mining sparse and very sparse databases with a huge number of transactions. The proposed model also outperforms various state-of-the-art distributed mining models in terms of running time.
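The general idea of weighted synthesis described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the weighting formula or the selection rule, so the normalized source weights, the weighted sum, the `min_utility` threshold, and all names below (`synthesize_patterns`, `branch_a`, `branch_b`) are assumptions made purely for illustration.

```python
from collections import defaultdict


def synthesize_patterns(local_patterns, weights, min_utility):
    """Combine locally mined pattern utilities into one global view.

    local_patterns: {source_name: {frozenset_of_items: local_utility}}
    weights:        {source_name: non-negative weight} (assumed to be given)
    min_utility:    hypothetical threshold; patterns below it are discarded
    """
    total_weight = sum(weights.values())
    combined = defaultdict(float)
    for source, patterns in local_patterns.items():
        # Normalize so that the source weights sum to 1 (an assumption).
        w = weights[source] / total_weight
        for pattern, utility in patterns.items():
            # A pattern missing from a source contributes nothing from it.
            combined[pattern] += w * utility
    # Pattern selection: keep only patterns at or above the threshold.
    return {p: u for p, u in combined.items() if u >= min_utility}


# Hypothetical patterns reported by two branches, with made-up utilities.
local_patterns = {
    "branch_a": {frozenset({"bread", "milk"}): 120, frozenset({"tea"}): 30},
    "branch_b": {frozenset({"bread", "milk"}): 80, frozenset({"coffee"}): 200},
}
weights = {"branch_a": 1, "branch_b": 3}

result = synthesize_patterns(local_patterns, weights, min_utility=50)
for pattern, utility in result.items():
    print(sorted(pattern), utility)
```

With these made-up inputs, the pattern {bread, milk} receives a synthesized utility of 0.25 × 120 + 0.75 × 80 = 90 and {coffee} receives 150, while {tea} (7.5) falls below the threshold and is dropped. Only the mined patterns and their utilities cross the network, which is the point the abstract makes against shipping the entire database.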
Pages: 16
Related papers
50 in total
  • [41] High-utility and diverse itemset mining
    Verma, Amit
    Dawar, Siddharth
    Kumar, Raman
    Navathe, Shamkant
    Goyal, Vikram
    APPLIED INTELLIGENCE, 2021, 51 (07) : 4649 - 4663
  • [42] A Hybrid Method for High-Utility Itemsets Mining in Large High-Dimensional Data
    Yu, Guangzhu
    Shao, Shihuang
    Luo, Bin
    Zeng, Xianhui
    INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2009, 5 (01) : 57 - 73
  • [43] Targeted High-Utility Itemset Querying
    Miao, Jinbao
    Wan, Shicheng
    Gan, Wensheng
    Sun, Jiayi
    Chen, Jiahui
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 5534 - 5543
  • [44] DPShield: Optimizing Differential Privacy for High-Utility Data Analysis in Sensitive Domains
    Thantharate, Pratik
    Bhojwani, Shyam
    Thantharate, Anurag
    ELECTRONICS, 2024, 13 (12)
  • [45] Scalable Mining of High-Utility Sequential Patterns With Three-Tier MapReduce Model
    Lin, Jerry Chun-Wei
    Djenouri, Youcef
    Srivastava, Gautam
    Li, Yuanfa
    Yu, Philip S.
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2022, 16 (03)
  • [46] Mining for Enthalpy-Based Average High-Utility Patterns with Tighter Upper Bounds
    Vankdothu R.
    Hameed M.A.
    SN Computer Science, 4 (1)
  • [47] A Pre-Large Weighted-Fusion System of Sensed High-Utility Patterns
    Srivastava, Gautam
    Lin, Jerry Chun-Wei
    Pirouz, Matin
    Li, Yuanfa
    Yun, Unil
    IEEE SENSORS JOURNAL, 2021, 21 (14) : 15626 - 15634
  • [48] Efficient Mining of High-Utility Sequential Rules
    Zida, Souleymane
    Fournier-Viger, Philippe
    Wu, Cheng-Wei
    Lin, Jerry Chun-Wei
    Tseng, Vincent S.
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, MLDM 2015, 2015, 9166 : 157 - 171
  • [49] PHM: Mining Periodic High-Utility Itemsets
    Fournier-Viger, Philippe
    Lin, Jerry Chun-Wei
    Quang-Huy Duong
    Thu-Lan Dam
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS, 2016, 9728 : 64 - 79
  • [50] Mining Transactional Databases for Frequent and High-Utility Fuzzy Sequential Patterns With Time Intervals
    Ritika
    Gupta, Sunil Kumar
    IEEE ACCESS, 2022, 10 : 71107 - 71119