Poisson Dependency Networks: Gradient Boosted Models for Multivariate Count Data

Cited by: 15
Authors
Hadiji, Fabian [1]
Molina, Alejandro [1]
Natarajan, Sriraam [2]
Kersting, Kristian [1]
Affiliations
[1] TU Dortmund Univ, LS 8, Dortmund, Germany
[2] Indiana Univ, Sch Informat & Comp, Bloomington, IN USA
Keywords
Graphical models; Dependency networks; Poisson distribution; Learning; MAP inference; ALGORITHM; SELECTION; IMAGES; GUIDE;
DOI
10.1007/s10994-015-5506-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Although count data are increasingly ubiquitous, surprisingly little work has employed probabilistic graphical models to model them. The univariate case has been well studied; in many situations, however, counts influence each other and should not be considered independently. Standard graphical models such as multinomial or Gaussian ones are often ill-suited as well, since they disregard either the infinite range over the natural numbers or the potentially asymmetric shape of the distribution of count variables. Existing classes of Poisson graphical models can only model negative conditional dependencies, neglect the prediction of counts, or do not scale well. To ease the modeling of multivariate count data, we therefore introduce a novel family of Poisson graphical models, called Poisson Dependency Networks (PDNs). A PDN consists of a set of local conditional Poisson distributions, each representing the probability of a single count variable given the others, which naturally facilitates simple Gibbs sampling inference. In contrast to existing Poisson graphical models, PDNs are non-parametric and trained using functional gradient ascent, i.e., boosting. The particularly simple form of the Poisson distribution allows us to develop the first multiplicative boosting approach: starting from an initial constant value, a log-linear Poisson model, or a Poisson regression tree, a PDN is represented as products of regression models grown in a stage-wise optimization. We demonstrate on several real-world datasets that PDNs can model positive and negative dependencies and scale well while often outperforming the state of the art, in particular when using multiplicative updates.
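The ideas in the abstract (one boosted conditional Poisson model per variable, followed by Gibbs sampling over those conditionals) can be illustrated with a minimal sketch. This is not the authors' implementation: the depth-1 regression stumps, the fixed step size, the clipping of leaf values, the synthetic data, and all function names below are assumptions chosen for brevity. The sketch boosts additively in log-space, which makes each rate a product of factors exp(step * stump); it only loosely mirrors the "products of regression models" idea and does not reproduce the paper's multiplicative update.

```python
import numpy as np

def fit_stump(X, target):
    """Least-squares depth-1 regression tree: (feature, threshold, left, right)."""
    mean = target.mean()
    best, best_sse = (0, np.inf, mean, mean), ((target - mean) ** 2).sum()
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:  # thresholds leaving both sides non-empty
            left = X[:, j] <= t
            lv, rv = target[left].mean(), target[~left].mean()
            sse = ((target[left] - lv) ** 2).sum() + ((target[~left] - rv) ** 2).sum()
            if sse < best_sse:
                best, best_sse = (j, t, lv, rv), sse
    return best

def stump_predict(stump, X):
    j, t, lv, rv = stump
    return np.where(X[:, j] <= t, lv, rv)

def fit_conditional(X, y, n_rounds=20, step=0.1):
    """Functional gradient ascent on the Poisson log-likelihood of log-rate F(x)."""
    f0 = np.log(y.mean() + 1e-8)           # initial constant model
    F = np.full(len(y), f0)
    stumps = []
    for _ in range(n_rounds):
        grad = y - np.exp(F)                # d/dF of sum(y * F - exp(F))
        j, t, lv, rv = fit_stump(X, grad)
        # Assumption: clip leaf values to [-1, 1] to keep the updates stable.
        s = (j, t, float(np.clip(lv, -1, 1)), float(np.clip(rv, -1, 1)))
        stumps.append(s)
        F += step * stump_predict(s, X)
    return f0, stumps, step

def predict_rate(model, X):
    f0, stumps, step = model
    F = np.full(X.shape[0], f0)
    for s in stumps:
        F += step * stump_predict(s, X)
    return np.exp(F)                        # rate = exp(f0) * prod exp(step * stump)

def poisson_loglik(y, rate):
    """Poisson log-likelihood up to the constant -sum(log(y!))."""
    return float(np.sum(y * np.log(rate) - rate))

def fit_pdn(data, **kw):
    """One boosted conditional per variable, conditioned on all other variables."""
    return [fit_conditional(np.delete(data, i, axis=1), data[:, i], **kw)
            for i in range(data.shape[1])]

def gibbs_sample(pdn, n_samples, burn_in=50, rng=None):
    """Resample each variable in turn from its local conditional Poisson."""
    rng = np.random.default_rng() if rng is None else rng
    d = len(pdn)
    x = np.ones(d)
    out = []
    for it in range(burn_in + n_samples):
        for i in range(d):
            rate = predict_rate(pdn[i], np.delete(x, i)[None, :])[0]
            x[i] = rng.poisson(rate)
        if it >= burn_in:
            out.append(x.copy())
    return np.array(out)

# Synthetic counts (hypothetical): x1 depends positively on x0, x2 is independent.
rng = np.random.default_rng(0)
x0 = rng.poisson(2.0, size=500)
x1 = rng.poisson(1.0 + x0)
x2 = rng.poisson(3.0, size=500)
data = np.column_stack([x0, x1, x2])

pdn = fit_pdn(data)
samples = gibbs_sample(pdn, n_samples=200, rng=rng)
```

Here `predict_rate(pdn[1], ...)` gives the learned rate of the second variable given the other two, and `samples` holds joint draws from the Gibbs chain. Because each conditional is fit separately, such a network need not correspond to a single consistent joint distribution, which is the usual trade-off of dependency networks.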
Pages: 477-507
Page count: 31