Efficient mining top-k high utility itemsets in incremental databases based on threshold raising strategies and pre-large concept

Citations: 0
Authors
Tung, N. T. [1, 2, 3]
Nguyen, Loan T. T. [2, 4]
Nguyen, Trinh D. D. [3 ]
Huynh, Bao [3 ]
Affiliations
[1] Univ Informat Technol, Fac Comp Sci, Ho Chi Minh City, Vietnam
[2] Vietnam Natl Univ, Ho Chi Minh City, Vietnam
[3] HUTECH Univ, Fac Informat Technol, Ho Chi Minh City, Vietnam
[4] Int Univ, Sch Comp Sci & Engn, Ho Chi Minh City, Vietnam
Keywords
Data mining; Incremental databases; Incremental threshold raising strategy; Pre-large; Rescan condition; Top-k high utility itemset; ALGORITHM; PATTERNS;
DOI
10.1016/j.knosys.2025.113273
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
High utility itemset mining (HUIM) is a sub-problem of frequent itemset mining (FIM) that has received considerable interest from researchers. It is used to analyze user behavior and improve business efficiency. Top-k high utility itemset mining (top-k HUIM) aims to find the k itemsets with the highest utility in the database, which removes the difficulty of selecting a threshold. Conventional top-k HUIM algorithms ignore the transactions that are continuously added to the database in a dynamic environment, resulting in inaccurate top-k HUI results. Existing top-k HUIM algorithms for incremental databases, however, either require users to request mining manually or run automatically every time an incremental batch is scanned, even though each batch is very small compared to the original database. Re-mining when the data has not been updated enough affects the results and consumes substantial resources without yielding valuable new insights. This research presents a threshold raising strategy that takes advantage of the original database's mining results in combination with strategies for the updated database. Furthermore, the paper proposes definitions of top-k mining using the pre-large concept, thresholds, conditions for re-mining, and a method that avoids re-mining on every update. Combining the proposed techniques and strategies, a complete algorithm named "PreTK" is proposed to address these issues. Experiments compare the algorithm's performance with baseline algorithms on diverse databases. The results demonstrate that the proposed method outperforms the state-of-the-art algorithms and may provide results faster, even when re-mining is necessary.
Pages: 24