Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression

Cited by: 18
Authors
Sprangers, Olivier [1, 2]
Schelter, Sebastian [2]
de Rijke, Maarten [2]
Affiliations
[1] AIRLab, Amsterdam, Netherlands
[2] Univ Amsterdam, Amsterdam, Netherlands
Keywords
Probabilistic Regression; Gradient Boosting Machines
DOI
10.1145/3447548.3467278
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Gradient Boosting Machines (GBMs) are hugely popular for solving tabular data problems. However, practitioners are not only interested in point predictions, but also in probabilistic predictions in order to quantify the uncertainty of the predictions. Creating such probabilistic predictions is difficult with existing GBM-based solutions: they either require training multiple models or they become too computationally expensive to be useful for large-scale settings. We propose Probabilistic Gradient Boosting Machines (PGBMs), a method to create probabilistic predictions with a single ensemble of decision trees in a computationally efficient manner. PGBM approximates the leaf weights in a decision tree as a random variable, and approximates the mean and variance of each sample in a dataset via stochastic tree ensemble update equations. These learned moments allow us to subsequently sample from a specified distribution after training. We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods: (i) PGBM enables probabilistic estimates without compromising on point performance in a single model, (ii) PGBM learns probabilistic estimates via a single model only (and without requiring multi-parameter boosting), and thereby offers a speedup of up to several orders of magnitude over existing state-of-the-art methods on large datasets, and (iii) PGBM achieves accurate probabilistic estimates in tasks with complex differentiable loss functions, such as hierarchical time series problems, where we observed up to 10% improvement in point forecasting performance and up to 300% improvement in probabilistic forecasting performance.
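The idea in the abstract of carrying a mean and a variance per sample through a tree ensemble, then sampling from a chosen distribution, can be illustrated with a minimal sketch. This is not the authors' implementation or the API of any PGBM library. The function names, the array layout, the fixed learning rate, and the choice of a Normal distribution are all hypothetical. The sketch also assumes that leaf moments are given directly and that trees are uncorrelated, so variances simply add; the abstract does not state how the actual update equations treat these points.

```python
import numpy as np

def ensemble_moments(leaf_index, leaf_mean, leaf_var, learning_rate=0.1):
    """Accumulate a per-sample mean and variance over an ensemble of trees.

    leaf_index: (n_trees, n_samples) ints, the leaf each sample reaches per tree.
    leaf_mean, leaf_var: (n_trees, n_leaves) floats, moments of each leaf weight.
    Assumption (illustrative only): trees are uncorrelated, so variances add.
    """
    n_trees, n_samples = leaf_index.shape
    mean = np.zeros(n_samples)
    var = np.zeros(n_samples)
    for t in range(n_trees):
        idx = leaf_index[t]
        # Each tree contributes a scaled leaf weight, treated as a random variable.
        mean += learning_rate * leaf_mean[t, idx]
        var += learning_rate ** 2 * leaf_var[t, idx]
    return mean, var

def sample_predictions(mean, var, n_draws, rng):
    """Draw samples from a Normal distribution with the learned moments."""
    return rng.normal(loc=mean, scale=np.sqrt(var), size=(n_draws, mean.shape[0]))

# Toy ensemble: 2 trees, 2 leaves per tree, 3 samples (made-up numbers).
leaf_index = np.array([[0, 1, 1],
                       [1, 0, 1]])
leaf_mean = np.array([[1.0, 3.0],
                      [2.0, 4.0]])
leaf_var = np.array([[0.5, 1.0],
                     [0.2, 0.4]])

mean, var = ensemble_moments(leaf_index, leaf_mean, leaf_var, learning_rate=0.5)
draws = sample_predictions(mean, var, n_draws=1000, rng=np.random.default_rng(0))

# Empirical 5% and 95% quantiles per sample give a simple prediction interval.
lower, upper = np.quantile(draws, [0.05, 0.95], axis=0)
```

Because only two moments per sample are stored, the output distribution can be swapped after training without refitting the trees, which is the property the abstract highlights.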
Pages: 1510-1520
Number of pages: 11