Incremental Optimization Mechanism for Constructing a Decision Tree in Data Stream Mining

Cited by: 16
Authors
Yang, Hang [1]
Fong, Simon [1]
Affiliations
[1] Univ Macau, Fac Sci & Technol, Dept Comp & Informat Sci, Taipa, Peoples R China
DOI
10.1155/2013/580397
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Imperfect data streams lead to tree-size explosion and degraded accuracy. Overfitting and imbalanced class distributions reduce the performance of the original decision-tree algorithm for stream mining. In this paper, we propose an incremental optimization mechanism to address these problems. The mechanism, called Optimized Very Fast Decision Tree (OVFDT), uses an optimized node-splitting control mechanism. Accuracy, tree size, and learning time are the significant factors influencing the algorithm's performance; naturally, a bigger tree takes longer to compute. OVFDT is a pioneering model equipped with an incremental optimization mechanism that seeks a balance between accuracy and tree size for data stream mining. It operates incrementally using a test-then-train approach. Three types of functional tree leaves improve the accuracy with which the tree model makes predictions for a new data stream in the testing phase, while the optimized node-splitting mechanism controls the growth of the tree model in the training phase. Experiments show that OVFDT obtains an optimal tree structure on both numeric and nominal datasets.
Pages: 14
Related papers
50 in total
  • [1] AN INCREMENTAL DECISION TREE FOR MINING MULTILABEL DATA
    Li, Peipei
    Wu, Xindong
    Hu, Xuegang
    Wang, Hao
    APPLIED ARTIFICIAL INTELLIGENCE, 2015, 29 (10) : 992 - 1014
  • [2] A Statistical Decision Tree Algorithm for Medical Data Stream Mining
    Cazzolato, Mirela Teixeira
    Ribeiro, Marcela Xavier
    2013 IEEE 26TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS), 2013, : 389 - 392
  • [3] Enhancement of Very Fast Decision Tree for Data Stream Mining
    Lefa, Mai
    Abd-Elkader, Hatem
    Salem, Rashed
    STUDIES IN INFORMATICS AND CONTROL, 2022, 31 (02): : 49 - 60
  • [4] Efficient Incremental Itemset Tree for Approximate Frequent Itemset Mining On Data Stream
    Bai, Pavitra S.
    Kumar, Ravi G. K.
    PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON APPLIED AND THEORETICAL COMPUTING AND COMMUNICATION TECHNOLOGY (ICATCCT), 2016, : 239 - 242
  • [5] An incremental fuzzy decision tree classification method for mining data streams
    Wang, Tao
    Li, Zhoujun
    Yan, Yuejin
    Chen, Huowang
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, PROCEEDINGS, 2007, 4571 : 91 - +
  • [6] Comparative Study of Various Decision Tree Methods for Data Stream Mining
    Mehta, Vaishali
    Sanghavi, Vishakha
    THIRD INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, 2019, 797 : 371 - 379
  • [7] AN INCREMENTAL DECISION TREE LEARNING METHODOLOGY REGARDING ATTRIBUTES IN MEDICAL DATA MINING
    Chao, Sam
    Wong, Fai
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 1694 - 1699
  • [8] A novel incremental approach for stream data mining
    Aboalsamh, Hatim A.
    AEJ - Alexandria Engineering Journal, 2009, 48 (04): : 419 - 426
  • [9] Strict Very Fast Decision Tree: A memory conservative algorithm for data stream mining
    Turrisi da Costa, Victor Guilherme
    Ponce de Leon Ferreira de Carvalho, Andre Carlos
    Barbon Junior, Sylvio
    PATTERN RECOGNITION LETTERS, 2018, 116 : 22 - 28
  • [10] Incremental Learning Framework for Mining Big Data Stream
    Eisa, Alaa
    EL-Rashidy, Nora
    Alshehri, Mohammad Dahman
    El-bakry, Hazem M.
    Abdelrazek, Samir
    CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 71 (02): : 2901 - 2921