Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees

Cited by: 0
Authors
Brophy, Jonathan [1]
Lowd, Daniel [1]
Affiliation
[1] Univ Oregon, Eugene, OR 97403 USA
Keywords
MACHINE; PERFORMANCE; PREDICTION; TUTORIAL; FORESTS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Gradient-boosted regression trees (GBRTs) are widely used for tabular regression problems, but they provide no estimate of uncertainty. We propose Instance-Based Uncertainty estimation for Gradient-boosted regression trees (IBUG), a simple method for extending any GBRT point predictor to produce probabilistic predictions. IBUG computes a non-parametric distribution around a prediction using the k-nearest training instances, where distance is measured with a tree-ensemble kernel. The runtime of IBUG depends on the number of training examples at each leaf in the ensemble and can be reduced by sampling trees or training instances. Empirically, IBUG achieves similar or better performance than the previous state of the art across 22 benchmark regression datasets. IBUG can also improve its probabilistic performance by using different base GBRT models, and it can model the posterior distribution of a prediction more flexibly than competing methods. Finally, we find that previous methods suffer from poor probabilistic calibration on some datasets, which can be mitigated using a scalar factor tuned on the validation data. Source code is available at https://github.com/jjbrophy47/ibug.
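The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes scikit-learn's GradientBoostingRegressor as the base GBRT, measures affinity as the number of trees in which a training instance shares a leaf with the test instance, and, as a further simplifying assumption, summarizes the k nearest targets with a Gaussian whose spread is rescaled by a single factor chosen on validation data. The function names knn_uncertainty, gaussian_nll, and tune_scale are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split


def knn_uncertainty(model, X_train, y_train, X_test, k=20, min_std=1e-6):
    """Point prediction plus a spread estimated from the k highest-affinity
    training instances (affinity = number of trees with a shared leaf)."""
    train_leaves = model.apply(X_train)  # shape (n_train, n_trees)
    test_leaves = model.apply(X_test)    # shape (n_test, n_trees)
    mean = model.predict(X_test)
    std = np.empty(len(X_test))
    for i, leaves in enumerate(test_leaves):
        # Count, per training instance, the trees where both land in the same leaf.
        affinity = (train_leaves == leaves).sum(axis=1)
        nearest = np.argsort(-affinity, kind="stable")[:k]
        # Assumption: summarize the neighbors' targets by their standard deviation.
        std[i] = max(np.std(y_train[nearest]), min_std)
    return mean, std


def gaussian_nll(y, mean, std):
    """Average negative log-likelihood under a Gaussian predictive distribution."""
    return np.mean(0.5 * np.log(2 * np.pi * std**2) + 0.5 * ((y - mean) / std) ** 2)


def tune_scale(y_val, mean_val, std_val, grid=np.logspace(-1, 1, 41)):
    """Pick one multiplicative factor for the spread by validation NLL."""
    return min(grid, key=lambda s: gaussian_nll(y_val, mean_val, s * std_val))


# Synthetic data stands in for a real tabular regression dataset.
X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)
model = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X_train, y_train)

mean_val, std_val = knn_uncertainty(model, X_train, y_train, X_val, k=20)
scale = tune_scale(y_val, mean_val, std_val)
print(f"tuned scale: {scale:.2f}")
```

The loop compares every test instance against all training instances in every tree, which is the cost the abstract refers to when it mentions sampling trees or training instances; subsampling the columns of the leaf matrices or the rows of the training set would be one way to trade accuracy for speed in this sketch.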
Pages: 15