Synthesizing High-Utility Patterns from Different Data Sources

Authors
Muley, Abhinav [1]
Gudadhe, Manish [1]
Affiliations
[1] St Vincent Pallotti Coll Engn & Technol, Dept Comp Engn, Nagpur 441108, Maharashtra, India
Keywords
data integration; data mining; high-utility patterns; knowledge discovery; weighted model; multi-database mining; distributed data mining
DOI
10.3390/data3030032
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Large organizations often need to collect data from branches spread over different geographic locations. Extensive amounts of data may be gathered at a centralized location in order to generate interesting patterns by mono-mining the amassed database. However, it is feasible to mine the useful patterns at each data source itself and forward only these patterns, rather than the entire original database, to the central location. These patterns can still exist in huge numbers, and different sources calculate different utility values for the same pattern. This paper proposes a weighted model for aggregating the high-utility patterns from different data sources. A pattern selection procedure is also proposed that discards low-utility patterns so that high-utility patterns can be extracted efficiently in the weighted model. The synthesizing model yields high-utility patterns, unlike association rule mining, in which frequent itemsets are generated by treating every item as having equal utility, an assumption that does not hold in real-life applications such as sales transactions. Extensive experiments on datasets with varied characteristics show that the proposed algorithm is effective for mining sparse and very sparse databases with a huge number of transactions. The proposed model also outperforms various state-of-the-art distributed mining models in terms of running time.
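The idea of synthesizing locally mined patterns under a weighted model can be illustrated with a minimal sketch. The abstract gives neither the weighting formula nor the selection rule, so everything below is an assumption made for illustration: the function name `synthesize`, the per-source weights, the two thresholds, and the choice of a plain weighted sum are all hypothetical and should not be read as the authors' algorithm.

```python
from collections import defaultdict


def synthesize(local_patterns, weights, min_local_utility, min_global_utility):
    """Combine locally mined patterns into global ones by a weighted sum.

    ASSUMPTION: this weighted-sum rule and both thresholds are illustrative
    only; the abstract does not specify the paper's actual formulas.

    local_patterns: {source_name: {frozenset_of_items: local_utility}}
    weights:        {source_name: weight}, assumed here to sum to 1
    """
    global_utility = defaultdict(float)
    for source, patterns in local_patterns.items():
        weight = weights[source]
        for pattern, utility in patterns.items():
            # Pattern selection (assumed rule): drop low-utility patterns
            # at the source so they are never forwarded or aggregated.
            if utility < min_local_utility:
                continue
            global_utility[pattern] += weight * utility
    # Keep only patterns whose synthesized utility clears the global threshold.
    return {p: u for p, u in global_utility.items() if u >= min_global_utility}


# Hypothetical data: two branches report utilities for their local patterns.
sources = {
    "branch_a": {frozenset({"bread", "milk"}): 120.0, frozenset({"pen"}): 5.0},
    "branch_b": {frozenset({"bread", "milk"}): 80.0, frozenset({"tea"}): 60.0},
}
weights = {"branch_a": 0.5, "branch_b": 0.5}

result = synthesize(sources, weights, min_local_utility=10.0, min_global_utility=50.0)
print(result)
```

In this toy run only the pattern {bread, milk} survives, with a synthesized utility of 0.5 × 120 + 0.5 × 80 = 100. The pattern {pen} is discarded at its source, and {tea} reaches only 0.5 × 60 = 30, which is below the assumed global threshold of 50.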
Pages: 16