Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees

Cited by: 0
Authors
Brophy, Jonathan [1]
Lowd, Daniel [1]
Affiliations
[1] Univ Oregon, Eugene, OR 97403 USA
Keywords
MACHINE; PERFORMANCE; PREDICTION; TUTORIAL; FORESTS
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Gradient-boosted regression trees (GBRTs) are hugely popular for solving tabular regression problems, but provide no estimate of uncertainty. We propose Instance-Based Uncertainty estimation for Gradient-boosted regression trees (IBUG), a simple method for extending any GBRT point predictor to produce probabilistic predictions. IBUG computes a non-parametric distribution around a prediction using the k-nearest training instances, where distance is measured with a tree-ensemble kernel. The runtime of IBUG depends on the number of training examples at each leaf in the ensemble, and can be improved by sampling trees or training instances. Empirically, we find that IBUG achieves similar or better performance than the previous state-of-the-art across 22 benchmark regression datasets. We also find that IBUG can achieve improved probabilistic performance by using different base GBRT models, and can more flexibly model the posterior distribution of a prediction than competing methods. We also find that previous methods suffer from poor probabilistic calibration on some datasets, which can be mitigated using a scalar factor tuned on the validation data. Source code is available at https://github.com/jjbrophy47/ibug.
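The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that), and it makes several assumptions that the abstract does not state: the tree-ensemble kernel is taken to be the number of trees in which two instances fall into the same leaf, the spread of the prediction is summarized by the variance of the k highest-affinity training targets, and the calibration scalar multiplies the variance and is chosen by a grid search over Gaussian negative log-likelihood on validation data. The leaf assignments are hand-written toy data rather than the output of a real GBRT, and all function names are hypothetical.

```python
import math
from statistics import pvariance


def knn_uncertainty(train_leaves, train_targets, test_leaves, point_prediction, k):
    """Return (mean, variance, neighbor_targets) for one test instance.

    train_leaves[i][t] is the leaf index of training instance i in tree t;
    test_leaves[t] is the leaf index of the test instance in tree t.
    """
    # Assumed kernel: count the trees in which both instances share a leaf.
    affinities = [
        sum(1 for a, b in zip(leaves, test_leaves) if a == b)
        for leaves in train_leaves
    ]
    # Take the k highest-affinity training instances (ties go to the lower index).
    order = sorted(range(len(train_leaves)), key=lambda i: (-affinities[i], i))
    neighbor_targets = [train_targets[i] for i in order[:k]]
    # The point prediction of the base model is kept as the mean; the
    # neighbors' targets supply the spread (and an empirical sample).
    return point_prediction, pvariance(neighbor_targets), neighbor_targets


def gaussian_nll(mean, variance, target):
    """Negative log-likelihood of target under a normal distribution."""
    return 0.5 * math.log(2 * math.pi * variance) + (target - mean) ** 2 / (2 * variance)


def tune_variance_scale(means, variances, targets, candidates):
    """Pick the multiplicative variance scalar with the lowest mean validation NLL."""
    def mean_nll(scale):
        return sum(
            gaussian_nll(m, scale * v, y) for m, v, y in zip(means, variances, targets)
        ) / len(targets)
    return min(candidates, key=mean_nll)


# Toy data: 5 training instances, 3 trees (hypothetical leaf indices).
train_leaves = [[0, 1, 2], [0, 1, 0], [1, 1, 2], [1, 0, 0], [0, 0, 2]]
train_targets = [10.0, 12.0, 11.0, 30.0, 14.0]
mean, variance, sample = knn_uncertainty(
    train_leaves, train_targets, test_leaves=[0, 1, 2], point_prediction=11.2, k=3
)
print(mean, variance, sample)
```

In this toy run the distant instance with target 30.0 shares no leaf with the test instance, so it does not inflate the variance. Sampling trees or training instances, as the abstract suggests, would shrink the loop over `train_leaves` at the cost of a coarser affinity estimate.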
Pages: 15
Related papers
50 in total
  • [31] Euclid: Identifying the reddest high-redshift galaxies in the Euclid Deep Fields with gradient-boosted trees
    Signor, T.
    Rodighiero, G.
    Bisigello, L.
    Bolzonella, M.
    Caputi, K. I.
    Daddi, E.
    De Lucia, G.
    Enia, A.
    Gabarra, L.
    Gruppioni, C.
    Humphrey, A.
    La Franca, F.
    Mancini, C.
    Pozzetti, L.
    Serjeant, S.
    Spinoglio, L.
    Van Mierlo, S. E.
    Andreon, S.
    Auricchio, N.
    Baldi, M.
    Bardelli, S.
    Battaglia, P.
    Bender, R.
    Bodendorf, C.
    Bonino, D.
    Branchini, E.
    Brescia, M.
    Brinchmann, J.
    Camera, S.
    Capobianco, V.
    Carbone, C.
    Carretero, J.
    Casas, S.
    Castellano, M.
    Cavuoti, S.
    Cimatti, A.
    Cledassou, R.
    Congedo, G.
    Conselice, C. J.
    Conversi, L.
    Copin, Y.
    Corcione, L.
    Courbin, F.
    Courtois, H. M.
    Da Silva, A.
    Degaudenzi, H.
    Di Giorgio, A. M.
    Dinis, J.
    Dubath, F.
    Dupac, X.
    ASTRONOMY & ASTROPHYSICS, 2024, 685
  • [32] Back-Analysis Method for Stope Displacements Using Gradient-Boosted Regression Tree and Firefly Algorithm
    Qi, Chongchong
    Fourie, Andy
    Zhao, Xu
    JOURNAL OF COMPUTING IN CIVIL ENGINEERING, 2018, 32 (05)
  • [33] Evaluating soccer match prediction models: a deep learning approach and feature optimization for gradient-boosted trees
    Yeung, Calvin
    Bunker, Rory
    Umemoto, Rikuhei
    Fujii, Keisuke
    MACHINE LEARNING, 2024, 113 (10) : 7541 - 7564
  • [34] Instance-based transfer learning for soil organic carbon estimation
    Bursac, Petar
    Kovacevic, Milos
    Bajat, Branislav
    FRONTIERS IN ENVIRONMENTAL SCIENCE, 2022, 10
  • [35] Forecasting PM2.5 Concentration Using Gradient-Boosted Regression Tree with CNN Learning Model
    Usha Ruby, A.
    Chandran, J. George Chellin
    Theerthagiri, Prasannavenkatesan
    Patil, Renuka
    Chaithanya, B. N.
    Jain, T. J. Swasthika
    OPTICAL MEMORY AND NEURAL NETWORKS, 2024, 33 (01) : 86 - 96
  • [36] Instance-based regression with missing data applied to a photocatalytic oxidation process
    Leon, Florin
    Piuleac, Ciprian George
    Curteanu, Silvia
    Poulios, Ioannis
    CENTRAL EUROPEAN JOURNAL OF CHEMISTRY, 2012, 10 (04) : 1149 - 1156
  • [37] Correction: InstanceSHAP: an instance-based estimation approach for Shapley values
    Babaei, Golnoosh
    Giudici, Paolo
    BEHAVIORMETRIKA, 2024, 51 (2) : 681 - 681
  • [38] Estimation Algorithm of Butyrylcholinesterase for Cirrhosis using Instance-based Reasoning
    Hatakeyama, Yutaka
    Nakajima, Noriaki
    Watabe, Teruaki
    Okuhara, Yoshiyasu
    2008 WORLD AUTOMATION CONGRESS PROCEEDINGS, VOLS 1-3, 2008 : 167 - 172
  • [39] Probabilistic Wind Power Forecasting Approach via Instance-Based Transfer Learning Embedded Gradient Boosting Decision Trees
    Cai, Long
    Gu, Jie
    Ma, Jinghuan
    Jin, Zhijian
    ENERGIES, 2019, 12 (01)
  • [40] Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500
    Krauss, Christopher
    Do, Xuan Anh
    Huck, Nicolas
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2017, 259 (02) : 689 - 702