Predicting nearly as well as the best pruning of a decision tree

Cited by: 70
Authors
Helmbold, DP [1]
Schapire, RE [1]
Affiliation
[1] AT&T BELL LABS, MURRAY HILL, NJ 07974
Keywords
decision trees; pruning; prediction; on-line learning
DOI
10.1023/A:1007396710653
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many algorithms for inferring a decision tree from data involve a two-phase process: First, a very large decision tree is grown which typically ends up "over-fitting" the data. To reduce over-fitting, in the second phase, the tree is pruned using one of a number of available methods. The final tree is then output and used for classification on test data. In this paper, we suggest an alternative approach to the pruning phase. Using a given unpruned decision tree, we present a new method of making predictions on test data, and we prove that our algorithm's performance will not be "much worse" (in a precise technical sense) than the predictions made by the best reasonably small pruning of the given decision tree. Thus, our procedure is guaranteed to be competitive (in terms of the quality of its predictions) with any pruning algorithm. We prove that our procedure is very efficient and highly robust. Our method can be viewed as a synthesis of two previously studied techniques. First, we apply Cesa-Bianchi et al.'s (1993) results on predicting using "expert advice" (where we view each pruning as an "expert") to obtain an algorithm that has provably low prediction loss, but that is computationally infeasible. Next, we generalize and apply a method developed by Buntine (1990, 1992) and Willems, Shtarkov and Tjalkens (1993, 1995) to derive a very efficient implementation of this procedure.
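The first step described in the abstract, treating every pruning as an "expert", can be illustrated with a brute-force sketch. The code below is not the authors' algorithm: it uses a generic exponentially weighted average with absolute loss, a uniform prior over prunings, and a learning rate `eta` chosen arbitrarily, all of which are assumptions made for illustration. The helper names (`node`, `prunings`, `predict_pruning`, `PruningMixture`) and the toy tree are hypothetical. It enumerates every pruning explicitly, which is exactly the exponential cost that the efficient implementation mentioned in the abstract is meant to avoid.

```python
import math


def node(pred, feat=None, left=None, right=None):
    """Hypothetical tree node: a prediction in [0, 1], plus a binary test on
    feature index `feat` for internal nodes (feat is None for leaves)."""
    return {"pred": pred, "feat": feat, "left": left, "right": right}


def prunings(tree, name=""):
    """Yield each pruning as the frozenset of node names that act as its leaves.
    Names are root-to-node paths: "" is the root, "0" left, "1" right."""
    yield frozenset([name])  # prune here: this node becomes a leaf
    if tree["feat"] is not None:
        for left_leaves in prunings(tree["left"], name + "0"):
            for right_leaves in prunings(tree["right"], name + "1"):
                yield left_leaves | right_leaves


def predict_pruning(tree, leaves, x):
    """Follow x down the tree and stop at the first node the pruning keeps as a leaf."""
    name = ""
    while name not in leaves:
        bit = 1 if x[tree["feat"]] else 0
        tree = tree["right"] if bit else tree["left"]
        name += str(bit)
    return tree["pred"]


class PruningMixture:
    """Exponentially weighted average over all prunings (brute force).
    Assumptions: uniform prior, absolute loss, fixed learning rate eta."""

    def __init__(self, tree, eta=2.0):
        self.tree = tree
        self.eta = eta
        self.experts = list(prunings(tree))
        self.weights = [1.0 / len(self.experts)] * len(self.experts)

    def predict(self, x):
        total = sum(self.weights)
        mix = sum(w * predict_pruning(self.tree, p, x)
                  for w, p in zip(self.weights, self.experts))
        return mix / total

    def update(self, x, y):
        # Shrink each pruning's weight according to its loss on (x, y).
        for i, p in enumerate(self.experts):
            loss = abs(predict_pruning(self.tree, p, x) - y)
            self.weights[i] *= math.exp(-self.eta * loss)


# Toy example: the target is x[0] AND x[1]; the unpruned tree fits it exactly.
TREE = node(0.5, 0, node(0.0), node(0.5, 1, node(0.0), node(1.0)))
DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

mixture = PruningMixture(TREE)
for x, y in DATA:
    mixture.update(x, y)
```

On this toy tree there are only three prunings, so the enumeration is cheap; the number of prunings grows exponentially with tree size, which is why this direct form is impractical for realistic trees.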
Source: Machine Learning, 1997, 27
Pages: 51-68
Number of pages: 18