Synthesizing High-Utility Patterns from Different Data Sources

Cited by: 0
Authors
Muley, Abhinav [1]
Gudadhe, Manish [1]
Affiliations
[1] St Vincent Pallotti Coll Engn & Technol, Dept Comp Engn, Nagpur 441108, Maharashtra, India
Keywords
data integration; data mining; high-utility patterns; knowledge discovery; weighted model; multi-database mining; distributed data mining
DOI
10.3390/data3030032
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Large organizations often need to collect data from branches spread across different geographic locations. One option is to gather large volumes of data at a central location and generate interesting patterns by mono-mining the amassed database. However, it is also feasible to mine useful patterns at each data source and forward only these patterns, rather than the entire original database, to the central location. These patterns can still be very numerous, and different sources calculate different utility values for the same pattern. This paper proposes a weighted model for aggregating high-utility patterns from different data sources. A pattern selection procedure is also proposed that discards low-utility patterns so that high-utility patterns can be extracted efficiently in the weighted model. The synthesizing model yields high-utility patterns, unlike association rule mining, which generates frequent itemsets by treating every item as having equal utility, an assumption that does not hold in real-life applications such as sales transactions. Extensive experiments on datasets with varied characteristics show that the proposed algorithm is effective for mining sparse and very sparse databases with a huge number of transactions. The proposed model also outperforms various state-of-the-art distributed mining models in terms of running time.
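The abstract does not give the weighting or selection formulas, so the following is only a minimal sketch of the general idea it describes: each source reports locally mined patterns with their utilities, rarely reported patterns are discarded, and the remaining utilities are combined with per-source weights. The function name `synthesize`, the parameters `min_votes` and `min_utility`, and the way the weights are chosen are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def synthesize(local_patterns, weights, min_votes=2, min_utility=0.0):
    """Combine per-source pattern utilities into one weighted utility.

    local_patterns: list of dicts mapping a pattern (frozenset of items)
                    to the utility reported by that source.
    weights:        one non-negative weight per source (hypothetical;
                    the paper's weighting scheme is not given in the abstract).
    min_votes:      assumed selection rule: keep a pattern only if at least
                    this many sources reported it.
    min_utility:    assumed threshold on the synthesized utility.
    """
    total = sum(weights)
    normalized = [w / total for w in weights]

    # Pattern selection: count how many sources reported each pattern.
    votes = defaultdict(int)
    for patterns in local_patterns:
        for pattern in patterns:
            votes[pattern] += 1
    kept = {p for p, v in votes.items() if v >= min_votes}

    # Weighted synthesis: sum of weight * local utility over the sources.
    synthesized = defaultdict(float)
    for weight, patterns in zip(normalized, local_patterns):
        for pattern, utility in patterns.items():
            if pattern in kept:
                synthesized[pattern] += weight * utility

    return {p: u for p, u in synthesized.items() if u >= min_utility}


# Three hypothetical branches reporting locally mined patterns.
branches = [
    {frozenset({"a", "b"}): 40.0, frozenset({"c"}): 5.0},
    {frozenset({"a", "b"}): 60.0},
    {frozenset({"a", "b"}): 20.0, frozenset({"c"}): 15.0},
]
result = synthesize(branches, weights=[1, 1, 2], min_votes=2, min_utility=10.0)
```

With these made-up inputs, only the pattern {a, b} survives: it is reported by all three branches and its weighted utility clears the threshold, while {c} passes the vote count but falls below `min_utility`.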
Pages: 16
Related papers
50 in total
  • [1] Synthesizing Global Exceptional Patterns in Different Data Sources
    Adhikari, Animesh
    JOURNAL OF INTELLIGENT SYSTEMS, 2012, 21 (03) : 293 - 323
  • [2] Mining High-Utility Sequential Patterns from Big Datasets
    Lin, Jerry Chun-Wei
    Li, Yuanfa
    Fournier-Viger, Philippe
    Djenouri, Youcef
    Wang, Leon Shyue-Liang
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 2674 - 2680
  • [3] Synthesizing high-frequency rules from different data sources
    Wu, XD
    Zhang, SC
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2003, 15 (02) : 353 - 367
  • [4] Synthesization of High-Utility Patterns in Parallel Computing
    Lin, Jerry Chun-Wei
    Li, Yuanfa
    Pirouz, Matin
    Tang, Linlin
    Voznak, Miroslav
    Sevcik, Lukas
    2019 IEEE/ACM 23RD INTERNATIONAL SYMPOSIUM ON DISTRIBUTED SIMULATION AND REAL TIME APPLICATIONS (DS-RT), 2019, : 276 - 282
  • [5] Mining High-Utility Patterns in Uncertain Tensors
    Coussat, Aurelien
    Nadisic, Nicolas
    Cerf, Loic
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES-2018), 2018, 126 : 403 - 412
  • [6] Mining High-utility Temporal Patterns on Time Interval-based Data
    Wang, Jun-Zhe
    Chen, Yi-Cheng
    Shih, Wen-Yueh
    Yang, Lin
    Liu, Yu-Shao
    Huang, Jiun-Long
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2020, 11 (04)
  • [7] Discovering Approximate and Significant High-Utility Patterns from Transactional Datasets
    Tang, Huijun
    Wang, Le
    Liu, Yangguang
    Qian, Jiangbo
    JOURNAL OF MATHEMATICS, 2022, 2022
  • [8] Mining of High-Utility Patterns in Big IoT Databases
    Wu, Jimmy Ming-Tai
    Srivastava, Gautam
    Lin, Jerry Chun-Wei
    Djenouri, Youcef
    Wei, Min
    Polap, Dawid
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING (ICAISC 2021), PT II, 2021, 12855 : 205 - 216
  • [9] Mining High-Utility Sequential Patterns in Uncertain Databases
    Lin, Jerry Chun-Wei
    Srivastava, Gautam
    Li, Yuanfa
    Hong, Tzung-Pei
    Wang, Shyue-Liang
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 5373 - 5380
  • [10] Synthesizing High-Utility Tabular Data with Enhanced Privacy via Split-and-Discard Pre-training
    Luo, Liwei
    Huang, Heyuan
    Zhang, Bingbing
    Xie, Yankai
    Zhang, Chi
    Wei, Lingbo
    IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM, 2023, : 6012 - 6017