An Improved Algorithm of Decision Trees for Streaming Data Based on VFDT

Cited by: 3
Authors
Li, Feixiong [1]
Liu, Quan [1]
Affiliations
[1] Soochow Univ, Prov Key Lab Comp Informat Proc Technol, Suzhou 215006, Peoples R China
Keywords
Streaming Data Mining; Decision Trees; Unequal Interval Numerical Pruning (UINP); Naive Bayes Classifiers
DOI
10.1109/ISISE.2008.256
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Decision trees are an effective model for classification. Recently, there has been much interest in mining streaming data. Because streaming data is large and unbounded, it is impractical to pass over the entire data more than once, so a one-pass online algorithm is necessary. One of the most successful algorithms for mining data streams is VFDT (Very Fast Decision Tree). We extend the VFDT system to EVFDT (Efficient-VFDT) in two directions: (1) we present the Uneven Interval Numerical Pruning (UINP) approach for efficiently processing numerical attributes; (2) we use naive Bayes classifiers associated with the nodes to process the samples, detecting outlying samples and reducing the size of the trees. In the experimental comparison, the two techniques significantly improve the efficiency and accuracy of decision tree construction on streaming data.
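The abstract does not describe EVFDT's internals, so the following is only background on the VFDT baseline it extends. VFDT decides when a leaf has seen enough samples to split by comparing the gap between the two best attributes against the Hoeffding bound. This is a minimal Python sketch of that test, not the authors' implementation; the function names, the default `delta`, and the default `tie_threshold` are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range is within epsilon of its mean over n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 num_classes: int = 2, delta: float = 1e-7,
                 tie_threshold: float = 0.05) -> bool:
    """Hoeffding split test on information gain (hypothetical helper).

    best_gain and second_gain are the observed gains of the two best
    attributes at a leaf that has seen n samples.
    """
    # Information gain lies in [0, log2(num_classes)], which gives its range.
    value_range = math.log2(num_classes)
    eps = hoeffding_bound(value_range, delta, n)
    # Split when the leader is reliably ahead, or when the bound is so small
    # that the two candidates are effectively tied.
    return (best_gain - second_gain > eps) or (eps < tie_threshold)
```

With these assumed defaults, a leaf that has seen 1000 samples splits when the gain gap is 0.20 but waits for more data when the gap is only 0.02.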
Pages: 597-600
Number of pages: 4
Related papers
50 in total
  • [41] Input data for decision trees
    Piramuthu, Selwyn
    EXPERT SYSTEMS WITH APPLICATIONS, 2008, 34 (02) : 1220 - 1226
  • [42] Decision trees for probabilistic data
    Aboa, JP
    Emilion, R
    DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2000, 1874 : 393 - 398
  • [43] An Algorithm for Anticipating Future Decision Trees from Concept-Drifting Data
    Boettcher, Mirko
    Spott, Martin
    Kruse, Rudolf
    RESEARCH AND DEVELOPMENT IN INTELLIGENT SYSTEMS XXV, 2009, : 293 - +
  • [44] Data mining with decision trees and decision rules
    Apte, C
    Weiss, S
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 1997, 13 (2-3) : 197 - 210
  • [45] Decision Trees for Mining Data Streams Based on the Gaussian Approximation
    Rutkowski, Leszek
    Jaworski, Maciej
    Pietruczuk, Lena
    Duda, Piotr
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (01) : 108 - 119
  • [46] Genetic program based data mining for fuzzy decision trees
    Smith, JF
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING IDEAL 2004, PROCEEDINGS, 2004, 3177 : 464 - 470
  • [47] On Algorithm for Building of Optimal α-Decision Trees
    Alkhalid, Abdulaziz
    Chikalov, Igor
    Moshkov, Mikhail
    ROUGH SETS AND CURRENT TRENDS IN COMPUTING, PROCEEDINGS, 2010, 6086 : 438 - 445
  • [48] Join Path-Based Data Augmentation for Decision Trees
    Ionescu, Andra
    Hai, Rihan
    Fragkoulis, Marios
    Katsifodimos, Asterios
    2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS (ICDEW 2022), 2022, : 84 - 88
  • [49] Combination classification method of multiple decision trees based on genetic algorithm
    Zhang, Zhe
    Chang, Gui-Ran
    Huang, Xiao-Yuan
    Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice, 2004, 24 (04):
  • [50] An early warning model of student achievement based on Decision Trees algorithm
    Liu, Wenbo
    Wu, Ji
    Gao, Xiaopeng
    Feng, Kai
    PROCEEDINGS OF 2017 IEEE 6TH INTERNATIONAL CONFERENCE ON TEACHING, ASSESSMENT, AND LEARNING FOR ENGINEERING (TALE), 2017, : 217 - 222