Efficient mining top-k high utility itemsets in incremental databases based on threshold raising strategies and pre-large concept

Cited by: 0
Authors
Tung, N. T. [1 ,2 ,3 ]
Nguyen, Loan T. T. [2 ,4 ]
Nguyen, Trinh D. D. [3 ]
Huynh, Bao [3 ]
Affiliations
[1] Univ Informat Technol, Fac Comp Sci, Ho Chi Minh City, Vietnam
[2] Vietnam Natl Univ, Ho Chi Minh City, Vietnam
[3] HUTECH Univ, Fac Informat Technol, Ho Chi Minh City, Vietnam
[4] Int Univ, Sch Comp Sci & Engn, Ho Chi Minh City, Vietnam
Keywords
Data mining; Incremental databases; Incremental threshold raising strategy; Pre-large; Rescan condition; Top-k high utility itemset; ALGORITHM; PATTERNS
DOI
10.1016/j.knosys.2025.113273
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
High utility itemset mining (HUIM) is a sub-problem of frequent itemset mining (FIM) that has received considerable interest from researchers. It is used to analyze user behavior and improve business efficiency. Top-k high utility itemset mining (top-k HUIM) aims to find the k itemsets with the highest utility in a database, which spares users the difficulty of choosing a utility threshold. Top-k HUIM algorithms typically ignore the transactions that are continuously added to the database in a dynamic environment, resulting in inaccurate top-k HUI results. Current top-k HUIM algorithms for incremental databases, in turn, either require users to request mining manually or re-mine automatically every time an incremental batch is scanned, even though each batch is very small compared with the original database. Re-mining when the data has not changed enough affects the results and consumes considerable resources without yielding valuable new insights. This research presents a threshold-raising strategy that exploits the mining results of the original database in combination with database-update strategies. Furthermore, the paper proposes definitions for top-k mining based on the pre-large concept, together with thresholds, conditions for re-mining, and a method that avoids re-mining on every update. Combining these techniques and strategies, a complete algorithm, "PreTK", is proposed to address these issues. Experiments compare the algorithm's performance with baseline algorithms on diverse databases. The results demonstrate that the proposed method outperforms the state-of-the-art algorithms and may provide results faster, even when re-mining is necessary.
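To make the idea of threshold raising concrete, the following is a minimal sketch of the generic technique, not the PreTK algorithm itself, whose details the abstract does not give. It assumes each transaction is a dictionary mapping an item to its utility in that transaction, and it enumerates candidates by brute force with no pruning. The names `itemset_utility`, `top_k_high_utility`, and `min_util` are hypothetical and chosen only for illustration.

```python
import heapq
from itertools import combinations


def itemset_utility(itemset, transactions):
    """Sum the utility of `itemset` over the transactions containing all its items."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] for item in itemset)
    return total


def top_k_high_utility(transactions, k):
    """Return the k itemsets with the highest utility and the final raised threshold.

    Assumption: brute-force enumeration, suitable only for tiny examples.
    """
    items = sorted({item for tx in transactions for item in tx})
    heap = []      # min-heap of (utility, itemset); heap[0] is the weakest kept itemset
    min_util = 0   # internal threshold, raised as better itemsets are found

    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions)
            if utility == 0:
                continue  # itemset never occurs together in any transaction
            if len(heap) < k:
                heapq.heappush(heap, (utility, itemset))
            elif utility > heap[0][0]:
                heapq.heapreplace(heap, (utility, itemset))
            if len(heap) == k:
                # Threshold raising: the k-th best utility becomes the new bar.
                min_util = heap[0][0]

    return sorted(heap, reverse=True), min_util


# Toy database: values are per-transaction item utilities (illustrative only).
transactions = [
    {"a": 5, "b": 2},
    {"a": 5, "c": 6},
    {"b": 4, "c": 3},
]
top_itemsets, threshold = top_k_high_utility(transactions, k=3)
```

In this sketch the user supplies only k. The threshold starts at zero and rises to the utility of the k-th best itemset found so far. A practical miner would use that rising threshold to prune the search space rather than enumerate every itemset.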
Pages: 24