Fast algorithm for high utility pattern mining with the sum of item quantities

Cited by: 35
Authors
Ryang, Heungmo [1 ]
Yun, Unil [1 ]
Ryu, Keun Ho [2 ]
Affiliations
[1] Sejong Univ, Dept Comp Engn, Seoul, South Korea
[2] Chungbuk Natl Univ, Dept Comp Sci, Cheongju, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Data mining; high utility patterns; single-pass tree construction; tree restructuring; utility mining; FREQUENT ITEMSETS;
DOI
10.3233/IDA-160811
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In frequent pattern mining, items are considered as having the same importance in a database, and their occurrences are represented as binary values in transactions. In real-world databases, however, items not only have relative importance but are also represented as non-binary values in transactions. High utility pattern mining, one of the most essential issues in the pattern mining field, recently emerged to address this limitation of frequent pattern mining. Meanwhile, tree construction with a single database scan is significant since a database scan is a time-consuming task. In utility mining, an additional database scan is necessary to identify actual high utility patterns from candidates. In this paper, we propose a novel tree structure, namely SIQ-Tree (Sum of Item Quantities), which captures database information in a single pass. Moreover, a restructuring method is suggested with strategies for reducing overestimated utilities. The proposed algorithm can construct the SIQ-Tree with only a single scan and effectively decrease the number of candidate patterns through the reduced overestimated utilities, which improves mining performance. Experimental results show that our algorithm outperforms a state-of-the-art one in terms of runtime and the number of generated candidates, with similar memory usage.
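The terms in the abstract can be illustrated with a small brute-force sketch. It is not the SIQ-Tree algorithm, whose details are not given in this record. It only shows how utility combines item quantities with item importance, and how an overestimated utility can prune candidates before their exact utility is computed. The names `PROFIT`, `DATABASE`, `utility`, `twu` and `mine` are hypothetical, the data is invented for illustration, and the use of transaction-weighted utility (TWU) as the overestimate is an assumption taken from the general utility-mining literature rather than from this paper.

```python
from itertools import combinations

# Hypothetical toy data: relative importance (e.g. unit profit) per item.
PROFIT = {"a": 5, "b": 2, "c": 1}

# Each transaction maps an item to its non-binary quantity.
DATABASE = [
    {"a": 1, "b": 2},
    {"b": 4, "c": 3},
    {"a": 2, "b": 1, "c": 1},
]


def transaction_utility(transaction):
    """Total utility of every item in one transaction."""
    return sum(PROFIT[item] * qty for item, qty in transaction.items())


def utility(itemset, database):
    """Exact utility: quantity * profit of the itemset's items,
    summed over the transactions that contain the whole itemset."""
    return sum(
        sum(PROFIT[item] * t[item] for item in itemset)
        for t in database
        if all(item in t for item in itemset)
    )


def twu(itemset, database):
    """Transaction-weighted utility: an overestimate (upper bound)
    of the utility of the itemset and of all its supersets."""
    return sum(
        transaction_utility(t)
        for t in database
        if all(item in t for item in itemset)
    )


def mine(database, min_utility):
    """Brute-force enumeration: skip candidates whose overestimate is
    below the threshold, then verify the rest with exact utility."""
    items = sorted({item for t in database for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            if twu(candidate, database) < min_utility:
                continue  # cannot be a high utility pattern
            exact = utility(candidate, database)
            if exact >= min_utility:
                result[candidate] = exact
    return result


print(mine(DATABASE, 15))
```

Candidates that pass the overestimate check but fail the exact check are the cost that a tighter overestimate avoids, which is the kind of reduction the abstract refers to.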
Pages: 395-415
Number of pages: 21
Related papers
50 items in total
  • [31] A Parallel Algorithm for Mining High Utility Itemsets
    Nguyen, Trinh D. D.
    Nguyen, Loan T. T.
    Bay Vo
    INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, ISAT 2018, PT II, 2019, 853 : 286 - 295
  • [32] An incremental mining algorithm for high utility itemsets
    Lin, Chun-Wei
    Lan, Guo-Cheng
    Hong, Tzung-Pei
    EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (08) : 7173 - 7180
  • [33] A Novel Algorithm for Mining High Utility Itemsets
    Bac Le
    Huy Nguyen
    Tung Anh Cao
    Bay Vo
    2009 FIRST ASIAN CONFERENCE ON INTELLIGENT INFORMATION AND DATABASE SYSTEMS, 2009, : 13 - 17
  • [34] Novel Algorithm for Mining High Utility Itemsets
    Shankar, S.
    Purusothaman, T.
    Jayanthi, S.
    ICCN: 2008 INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING, 2008, : 619 - +
  • [35] Applying the maximum utility measure in high utility sequential pattern mining
    Lan, Guo-Cheng
    Hong, Tzung-Pei
    Tseng, Vincent S.
    Wang, Shyue-Liang
    EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (11) : 5071 - 5081
  • [36] High utility pattern mining algorithm over data streams using ext-list
    Han, Meng
    Li, Muhang
    Chen, Zhiqiang
    Wu, Hongxin
    Zhang, Xilong
    APPLIED INTELLIGENCE, 2023, 53 (22) : 27072 - 27095
  • [37] High Average-Utility Pattern Mining Based on Genetic Algorithm with a Novel Pruning Strategy
    Chen, Qiao
    Fang, Wei
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT I, ICIC 2024, 2024, 14862 : 3 - 13
  • [38] Distributed Algorithm for High-Utility Subgraph Pattern Mining over Big Data Platforms
    Khare, Alind
    Goyal, Vikram
    Baride, Srikanth
    Prasad, Sushil K.
    McDermott, Michael
    Shah, Dhara
    2017 IEEE 24TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2017, : 263 - 272
  • [39] High utility pattern mining algorithm over data streams using ext-list.
    Meng Han
    Muhang Li
    Zhiqiang Chen
    Hongxin Wu
    Xilong Zhang
    Applied Intelligence, 2023, 53 : 27072 - 27095
  • [40] Distributed Mining of High Utility Sequential Patterns with Negative Item Values
    Varma, Manoj
    Sumalatha, Saleti
    Reddy, Akhileshwar
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (03) : 592 - 598