Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression

Cited by: 18
Authors
Sprangers, Olivier [1 ,2 ]
Schelter, Sebastian [2 ]
de Rijke, Maarten [2 ]
Affiliations
[1] AIRLab, Amsterdam, Netherlands
[2] Univ Amsterdam, Amsterdam, Netherlands
Keywords
Probabilistic Regression; Gradient Boosting Machines;
DOI
10.1145/3447548.3467278
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Gradient Boosting Machines (GBMs) are hugely popular for solving tabular data problems. However, practitioners are not only interested in point predictions, but also in probabilistic predictions in order to quantify the uncertainty of the predictions. Creating such probabilistic predictions is difficult with existing GBM-based solutions: they either require training multiple models or they become too computationally expensive to be useful for large-scale settings. We propose Probabilistic Gradient Boosting Machines (PGBMs), a method to create probabilistic predictions with a single ensemble of decision trees in a computationally efficient manner. PGBM approximates the leaf weights in a decision tree as a random variable, and approximates the mean and variance of each sample in a dataset via stochastic tree ensemble update equations. These learned moments allow us to subsequently sample from a specified distribution after training. We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods: (i) PGBM enables probabilistic estimates without compromising on point performance in a single model, (ii) PGBM learns probabilistic estimates via a single model only (and without requiring multi-parameter boosting), and thereby offers a speedup of up to several orders of magnitude over existing state-of-the-art methods on large datasets, and (iii) PGBM achieves accurate probabilistic estimates in tasks with complex differentiable loss functions, such as hierarchical time series problems, where we observed up to 10% improvement in point forecasting performance and up to 300% improvement in probabilistic forecasting performance.
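The abstract describes the core mechanism: each leaf weight is treated as a random variable, per-sample mean and variance are accumulated across the boosted ensemble, and a distribution parameterized by those learned moments is sampled after training. The following is a minimal conceptual sketch of that idea, not the authors' implementation: the leaf statistics and learning rate are illustrative values, and variances are combined under a simplifying independence assumption between boosting stages.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: suppose a trained ensemble of three boosting stages has routed
# one test sample to one leaf per tree. In the PGBM view each leaf weight is
# a random variable; here each leaf stores the (mean, variance) of the
# training residuals it covers. Values are illustrative, not from the paper.
leaf_means = np.array([2.0, 0.5, 0.1])    # per-stage leaf weight means
leaf_vars = np.array([0.30, 0.05, 0.02])  # per-stage leaf weight variances
learning_rate = 0.1

# Boosted point prediction: sum of shrunken leaf means.
mu = learning_rate * leaf_means.sum()

# Simplifying assumption: stages are independent, so variances add,
# scaled by the squared learning rate.
var = (learning_rate ** 2) * leaf_vars.sum()

# After training, a target distribution (here a normal) is parameterized by
# the learned moments; sampling from it yields probabilistic predictions.
samples = rng.normal(mu, np.sqrt(var), size=10_000)
lo, hi = np.quantile(samples, [0.05, 0.95])  # empirical 90% interval
print(f"mean={mu:.3f} var={var:.5f} interval=({lo:.3f}, {hi:.3f})")
```

Because only the moments are learned, the output distribution (normal, log-normal, etc.) can be chosen or swapped after training without refitting the ensemble, which is what makes the single-model probabilistic prediction cheap.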
Pages: 1510 - 1520
Page count: 11
Related Papers (50 records)
  • [1] A Large-Scale Study of Probabilistic Calibration in Neural Network Regression
    Dheur, Victor
    Ben Taieb, Souhaib
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202
  • [2] Probabilistic gradient boosting machines for GEFCom2014 wind forecasting
    Landry, Mark
    Edinger, Thomas P.
    Patschke, David
    Varrichio, Craig
    INTERNATIONAL JOURNAL OF FORECASTING, 2016, 32 (03) : 1061 - 1066
  • [3] Probabilistic queries in large-scale networks
    Pedone, F
    Duarte, NL
    Goulart, M
    DEPENDABLE COMPUTING: EDCC-4, PROCEEDINGS, 2002, 2485 : 209 - 226
  • [4] Towards Large-Scale Probabilistic OBDA
    Schoenfisch, Joerg
    Stuckenschmidt, Heiner
    SCALABLE UNCERTAINTY MANAGEMENT (SUM 2015), 2015, 9310 : 106 - 120
  • [5] Mechanisms of probabilistic cueing in large-scale search
    Smith, A. D.
    Hood, B. M.
    Gilchrist, I. D.
    PERCEPTION, 2007, 36 (09) : 1402 - 1402
  • [6] Probabilistic Cuing in Large-Scale Environmental Search
    Smith, Alastair D.
    Hood, Bruce M.
    Gilchrist, Iain D.
    JOURNAL OF EXPERIMENTAL PSYCHOLOGY-LEARNING MEMORY AND COGNITION, 2010, 36 (03) : 605 - 618
  • [7] Probabilistic reliable dissemination in large-scale systems
    Kermarrec, AM
    Massoulié, L
    Ganesh, AJ
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2003, 14 (03) : 248 - 258
  • [8] NON-PROBABILISTIC AGGREGATION OR PROBABILISTIC SAMPLING IN LARGE-SCALE PLANT LOCATION PROBLEMS
    HOLROYD, WM
    AMERICAN JOURNAL OF AGRICULTURAL ECONOMICS, 1974, 56 (05) : 1206 - 1206
  • [9] Probabilistic Belief Embedding for Large-Scale Knowledge Population
    Fan, Miao
    Zhou, Qiang
    Abel, Andrew
    Zheng, Thomas Fang
    Grishman, Ralph
    COGNITIVE COMPUTATION, 2016, 8 (06) : 1087 - 1102
  • [10] Large-scale probabilistic predictors with and without guarantees of validity
    Vovk, Vladimir
    Petej, Ivan
    Fedorova, Valentina
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28