Harnessing the Power of Choices in Decision Tree Learning

Cited by: 0
|
Authors
Blanc, Guy [1 ]
Lange, Jane [2 ]
Pabbaraju, Chirag [1 ]
Sullivan, Colin [1 ]
Tan, Li-Yang [1 ]
Tiwari, Mo [1 ]
Affiliations
[1] Stanford, Stanford, CA 94305 USA
[2] MIT, Cambridge, MA 02139 USA
Keywords
DOI
None
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a simple generalization of standard and empirically successful decision tree learning algorithms such as ID3, C4.5, and CART. These algorithms, which have been central to machine learning for decades, are greedy in nature: they grow a decision tree by iteratively splitting on the best attribute. Our algorithm, Top-k, considers the k best attributes as possible splits instead of just the single best attribute.

We demonstrate, theoretically and empirically, the power of this simple generalization. We first prove a greediness hierarchy theorem showing that for every k ∈ ℕ, Top-(k + 1) can be dramatically more powerful than Top-k: there are data distributions for which the former achieves accuracy 1 − ε, whereas the latter only achieves accuracy 1/2 + ε. We then show, through extensive experiments, that Top-k outperforms the two main approaches to decision tree learning: classic greedy algorithms and more recent "optimal decision tree" algorithms. On one hand, Top-k consistently enjoys significant accuracy gains over greedy algorithms across a wide range of benchmarks. On the other hand, Top-k is markedly more scalable than optimal decision tree algorithms and is able to handle dataset and feature set sizes that remain far beyond the reach of these algorithms. The code to reproduce our results is available at: https://github.com/SullivanC19/pydl8.5-topk.
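To make the idea concrete, below is a minimal, hypothetical sketch of the Top-k recipe the abstract describes, written from scratch for binary features and labels; it is not the authors' implementation (see the linked repository for that). All function names (`entropy`, `gain`, `top_k_tree`, etc.) are invented for illustration. At each node it ranks features by information gain, recursively tries each of the top k as the split, and keeps whichever choice yields the best training accuracy; with k = 1 it reduces to ordinary greedy (ID3-style) growth.

```python
import math

def entropy(labels):
    """Shannon entropy of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def gain(X, y, f):
    """Information gain from splitting on binary feature f."""
    left = [yi for xi, yi in zip(X, y) if xi[f] == 0]
    right = [yi for xi, yi in zip(X, y) if xi[f] == 1]
    w = len(left) / len(y)
    return entropy(y) - w * entropy(left) - (1 - w) * entropy(right)

def majority(y):
    """Majority label, used for leaves."""
    return int(sum(y) * 2 >= len(y))

def predict(tree, x):
    """A tree is either a leaf label or a (feature, zero_branch, one_branch) tuple."""
    if not isinstance(tree, tuple):
        return tree
    f, zero, one = tree
    return predict(one if x[f] else zero, x)

def accuracy(tree, X, y):
    return sum(predict(tree, xi) == yi for xi, yi in zip(X, y)) / len(y)

def top_k_tree(X, y, depth, k):
    """Grow a depth-bounded tree, branching over the k highest-gain splits at
    each node and keeping the choice with the best training accuracy."""
    leaf = majority(y)
    if depth == 0 or len(set(y)) <= 1:
        return leaf
    feats = sorted(range(len(X[0])), key=lambda f: gain(X, y, f), reverse=True)[:k]
    best, best_acc = leaf, accuracy(leaf, X, y)
    for f in feats:
        zero = [(xi, yi) for xi, yi in zip(X, y) if xi[f] == 0]
        one = [(xi, yi) for xi, yi in zip(X, y) if xi[f] == 1]
        if not zero or not one:  # degenerate split: all points on one side
            continue
        cand = (f,
                top_k_tree([x for x, _ in zero], [c for _, c in zero], depth - 1, k),
                top_k_tree([x for x, _ in one], [c for _, c in one], depth - 1, k))
        acc = accuracy(cand, X, y)
        if acc > best_acc:
            best, best_acc = cand, acc
    return best

# XOR of two bits: no single feature has positive gain, yet a depth-2 tree fits it.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
print(accuracy(top_k_tree(X, y, depth=2, k=2), X, y))  # 1.0
```

Note that this sketch explores up to k^depth candidate subtrees, which illustrates the scalability trade-off the abstract draws between greedy algorithms (k = 1) and exhaustive "optimal decision tree" search.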
Pages: 13
Related Papers
50 records in total
  • [41] Structural diversity for decision tree ensemble learning
    Tao Sun
    Zhi-Hua Zhou
    Frontiers of Computer Science, 2018, 12 : 560 - 570
  • [42] Bayesian evidence framework for decision tree learning
    Chatpatanasiri, R
    Kijsirikul, B
    Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 2005, 803 : 88 - 95
  • [43] Structure and majority classes in decision tree learning
    School of Computing and Informatics Engineering, University of Ulster, Coleraine, Co. Londonderry BT52 1SA, United Kingdom
    J. Mach. Learn. Res., 2007: 1747 - 1768
  • [44] Integrating decision tree learning into inductive databases
    Fromont, Elisa
    Blockeel, Hendrik
    Struyf, Jan
    KNOWLEDGE DISCOVERY IN INDUCTIVE DATABASES, 2007, 4747 : 81 - 96
  • [45] An Interactive Web Application for Decision Tree Learning
    Elia, Miriam
    Gajek, Carola
    Schiendorfer, Alexander
    Reif, Wolfgang
    EUROPEAN CONFERENCE ON MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, VOL 141, 2020, 141
  • [46] Harnessing the power of decision-support tools to trigger climate action
    Molina-Perez, Edmundo
    NATURE COMPUTATIONAL SCIENCE, 2023, 3 (06): : 461 - 463
  • [48] Choices for tree conservation
    Oldfield, Sara
    ORYX, 2008, 42 (02) : 159 - 160
  • [49] Harnessing the Power of Ensemble Machine Learning for the Heart Stroke Classification
    Pal P.
    Nandal M.
    Dikshit S.
    Thusu A.
    Singh H.V.
    EAI Endorsed Transactions on Pervasive Health and Technology, 2023, 9 (01)
  • [50] Harnessing the Power of Social Learning: Exploring Educators' and Students' Perceptions
    Tung, Tran Minh
    Lan, Duong Hoai
    Cuc, Tran Thi Kim
    Oanh, Vo Thi Kim
    Ngoc, Ngo Bich
    JOURNAL OF ELECTRICAL SYSTEMS, 2024, 20 (03) : 1487 - 1507