Synthesizing High-Utility Patterns from Different Data Sources

Cited by: 0

Authors:

Muley, Abhinav [1]

Gudadhe, Manish [1]

Affiliations:

[1] St Vincent Pallotti Coll Engn & Technol, Dept Comp Engn, Nagpur 441108, Maharashtra, India

Source:

DATA | 2018 / Volume 3 / Issue 03

Keywords:

data integration; data mining; high-utility patterns; knowledge discovery; weighted model; multi-database mining; distributed data mining

DOI:

10.3390/data3030032

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Large organizations often need to collect data from branches spread across different geographic locations. Extensive amounts of data may be gathered at a central location so that interesting patterns can be generated by mono-mining the amassed database. However, it is feasible to mine the useful patterns at each data source and forward only these patterns, rather than the entire original database, to the central site. These patterns can still be very numerous, and different sources calculate different utility values for each pattern. This paper proposes a weighted model for aggregating high-utility patterns from different data sources. A pattern selection procedure is also proposed that discards low-utility patterns so that high-utility patterns can be extracted efficiently in the weighted model. The synthesizing model yields high-utility patterns, unlike association rule mining, in which frequent itemsets are generated by treating every item as having equal utility, an assumption that does not hold in real-life applications such as sales transactions. Extensive experiments on datasets with varied characteristics show that the proposed algorithm is effective for mining sparse and very sparse databases with a huge number of transactions. The proposed model also outperforms various state-of-the-art distributed mining models in terms of running time.
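The idea of aggregating locally mined patterns with source weights and then discarding low-utility patterns can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the weighting formula, so the weighted sum, the function name `synthesize_patterns`, the externally supplied `source_weights`, and the `min_utility` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def synthesize_patterns(source_patterns, source_weights, min_utility):
    """Combine per-source pattern utilities into one weighted utility per pattern.

    Illustrative assumptions (not taken from the paper):
      source_patterns: dict of source name -> {pattern (frozenset): local utility}
      source_weights:  dict of source name -> weight, assumed to sum to 1
      min_utility:     hypothetical threshold; patterns below it are discarded
    """
    combined = defaultdict(float)
    for source, patterns in source_patterns.items():
        weight = source_weights[source]
        for pattern, utility in patterns.items():
            # Each source contributes its local utility scaled by its weight.
            combined[pattern] += weight * utility
    # Pattern selection: keep only patterns whose synthesized utility is high enough.
    return {p: u for p, u in combined.items() if u >= min_utility}


# Two hypothetical branches report different utilities for the same patterns.
branch_patterns = {
    "branch_a": {frozenset({"bread", "milk"}): 40, frozenset({"tea"}): 10},
    "branch_b": {frozenset({"bread", "milk"}): 60, frozenset({"tea"}): 4},
}
branch_weights = {"branch_a": 0.5, "branch_b": 0.5}

result = synthesize_patterns(branch_patterns, branch_weights, min_utility=20)
print(result)
```

With equal weights, the pattern {bread, milk} receives a synthesized utility of 50.0 and is kept, while {tea} receives 7.0 and is discarded. Only the locally mined patterns and their utilities are sent to the central site, not the raw transactions.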

Pages: 16

