Harnessing the Power of Choices in Decision Tree Learning

Cited by: 0
|
Authors
Blanc, Guy [1 ]
Lange, Jane [2 ]
Pabbaraju, Chirag [1 ]
Sullivan, Colin [1 ]
Tan, Li-Yang [1 ]
Tiwari, Mo [1 ]
Affiliations
[1] Stanford, Stanford, CA 94305 USA
[2] MIT, Cambridge, MA 02139 USA
Keywords
DOI
None
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a simple generalization of standard and empirically successful decision tree learning algorithms such as ID3, C4.5, and CART. These algorithms, which have been central to machine learning for decades, are greedy in nature: they grow a decision tree by iteratively splitting on the best attribute. Our algorithm, Top-k, considers the k best attributes as possible splits instead of just the single best attribute.

We demonstrate, theoretically and empirically, the power of this simple generalization. We first prove a greediness hierarchy theorem showing that for every k ∈ ℕ, Top-(k + 1) can be dramatically more powerful than Top-k: there are data distributions for which the former achieves accuracy 1 − ε, whereas the latter only achieves accuracy 1/2 + ε. We then show, through extensive experiments, that Top-k outperforms the two main approaches to decision tree learning: classic greedy algorithms and more recent "optimal decision tree" algorithms. On one hand, Top-k consistently enjoys significant accuracy gains over greedy algorithms across a wide range of benchmarks. On the other hand, Top-k is markedly more scalable than optimal decision tree algorithms and is able to handle dataset and feature set sizes that remain far beyond the reach of these algorithms. The code to reproduce our results is available at: https://github.com/SullivanC19/pydl8.5-topk.
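To make the idea concrete, below is a minimal, hypothetical sketch of the Top-k recipe the abstract describes, written from scratch for binary features and labels; it is not the authors' implementation (see the linked repository for that). All function names (`entropy`, `gain`, `top_k_tree`, etc.) are invented for illustration. At each node it ranks features by information gain, recursively tries each of the top k as the split, and keeps whichever choice yields the best training accuracy; with k = 1 it reduces to ordinary greedy (ID3-style) growth.

```python
import math

def entropy(labels):
    """Shannon entropy of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def gain(X, y, f):
    """Information gain from splitting on binary feature f."""
    left = [yi for xi, yi in zip(X, y) if xi[f] == 0]
    right = [yi for xi, yi in zip(X, y) if xi[f] == 1]
    w = len(left) / len(y)
    return entropy(y) - w * entropy(left) - (1 - w) * entropy(right)

def majority(y):
    """Majority label, used for leaves."""
    return int(sum(y) * 2 >= len(y))

def predict(tree, x):
    """A tree is either a leaf label or a (feature, zero_branch, one_branch) tuple."""
    if not isinstance(tree, tuple):
        return tree
    f, zero, one = tree
    return predict(one if x[f] else zero, x)

def accuracy(tree, X, y):
    return sum(predict(tree, xi) == yi for xi, yi in zip(X, y)) / len(y)

def top_k_tree(X, y, depth, k):
    """Grow a depth-bounded tree, branching over the k highest-gain splits at
    each node and keeping the choice with the best training accuracy."""
    leaf = majority(y)
    if depth == 0 or len(set(y)) <= 1:
        return leaf
    feats = sorted(range(len(X[0])), key=lambda f: gain(X, y, f), reverse=True)[:k]
    best, best_acc = leaf, accuracy(leaf, X, y)
    for f in feats:
        zero = [(xi, yi) for xi, yi in zip(X, y) if xi[f] == 0]
        one = [(xi, yi) for xi, yi in zip(X, y) if xi[f] == 1]
        if not zero or not one:  # degenerate split: all points on one side
            continue
        cand = (f,
                top_k_tree([x for x, _ in zero], [c for _, c in zero], depth - 1, k),
                top_k_tree([x for x, _ in one], [c for _, c in one], depth - 1, k))
        acc = accuracy(cand, X, y)
        if acc > best_acc:
            best, best_acc = cand, acc
    return best

# XOR of two bits: no single feature has positive gain, yet a depth-2 tree fits it.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
print(accuracy(top_k_tree(X, y, depth=2, k=2), X, y))  # 1.0
```

Note that this sketch explores up to k^depth candidate subtrees, which illustrates the scalability trade-off the abstract draws between greedy algorithms (k = 1) and exhaustive "optimal decision tree" search.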
Pages: 13
Related Papers
50 records in total
  • [41] Structural diversity for decision tree ensemble learning
    Tao Sun
    Zhi-Hua Zhou
    Frontiers of Computer Science, 2018, 12 : 560 - 570
  • [42] Bayesian evidence framework for decision tree learning
    Chatpatanasiri, R
    Kijsirikul, B
    Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 2005, 803 : 88 - 95
  • [43] Structure and majority classes in decision tree learning
    School of Computing and Informatics Engineering, University of Ulster, Coleraine, Co. Londonderry BT52 1SA, United Kingdom
    J. Mach. Learn. Res., 2007: 1747 - 1768
  • [44] Integrating decision tree learning into inductive databases
    Fromont, Elisa
    Blockeel, Hendrik
    Struyf, Jan
    KNOWLEDGE DISCOVERY IN INDUCTIVE DATABASES, 2007, 4747 : 81 - 96
  • [45] An Interactive Web Application for Decision Tree Learning
    Elia, Miriam
    Gajek, Carola
    Schiendorfer, Alexander
    Reif, Wolfgang
    EUROPEAN CONFERENCE ON MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, VOL 141, 2020, 141
  • [46] Harnessing the power of decision-support tools to trigger climate action
    Molina-Perez, Edmundo
    NATURE COMPUTATIONAL SCIENCE, 2023, 3 (06): : 461 - 463
  • [48] Choices for tree conservation
    Oldfield, Sara
    ORYX, 2008, 42 (02) : 159 - 160
  • [49] Harnessing the Power of Ensemble Machine Learning for the Heart Stroke Classification
    Pal P.
    Nandal M.
    Dikshit S.
    Thusu A.
    Singh H.V.
    EAI Endorsed Transactions on Pervasive Health and Technology, 2023, 9 (01)
  • [50] Harnessing the Power of Social Learning: Exploring Educators' and Students' Perceptions
    Tung, Tran Minh
    Lan, Duong Hoai
    Cuc, Tran Thi Kim
    Oanh, Vo Thi Kim
    Ngoc, Ngo Bich
    JOURNAL OF ELECTRICAL SYSTEMS, 2024, 20 (03) : 1487 - 1507