Poisson Dependency Networks: Gradient Boosted Models for Multivariate Count Data

Cited by: 15
Authors
Hadiji, Fabian [1]
Molina, Alejandro [1]
Natarajan, Sriraam [2]
Kersting, Kristian [1]
Affiliations
[1] TU Dortmund Univ, LS 8, Dortmund, Germany
[2] Indiana Univ, Sch Informat & Comp, Bloomington, IN USA
Keywords
Graphical models; Dependency networks; Poisson distribution; Learning; MAP inference; ALGORITHM; SELECTION; IMAGES; GUIDE
DOI
10.1007/s10994-015-5506-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Although count data are increasingly ubiquitous, surprisingly little work has employed probabilistic graphical models for modeling count data. The univariate case has indeed been well studied; however, in many situations counts influence each other and should not be considered independently. Standard graphical models such as multinomial or Gaussian ones are also often ill-suited, since they disregard either the infinite range over the natural numbers or the potentially asymmetric shape of the distribution of count variables. Existing classes of Poisson graphical models can only model negative conditional dependencies, neglect the prediction of counts, or do not scale well. To ease the modeling of multivariate count data, we therefore introduce a novel family of Poisson graphical models, called Poisson Dependency Networks (PDNs). A PDN consists of a set of local conditional Poisson distributions, each representing the probability of a single count variable given the others, which naturally facilitates simple Gibbs sampling inference. In contrast to existing Poisson graphical models, PDNs are non-parametric and trained using functional gradient ascent, i.e., boosting. The particularly simple form of the Poisson distribution allows us to develop the first multiplicative boosting approach: starting from an initial constant value, a log-linear Poisson model, or a Poisson regression tree, a PDN is represented as a product of regression models grown in a stage-wise optimization. We demonstrate on several real-world datasets that PDNs can model positive and negative dependencies and scale well while often outperforming the state of the art, in particular when using multiplicative updates.
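The Gibbs sampling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each local conditional is a log-linear Poisson model, and the names `conditional_mean`, `gibbs_sample`, `bias`, and `weights` are hypothetical, chosen for illustration only. The clipping of the log-rate is a numerical safeguard added here, not part of the model described above.

```python
import numpy as np


def conditional_mean(i, x, bias, weights):
    """Poisson mean of x[i] given all other counts.

    Assumed (hypothetical) log-linear form:
        log(lambda_i) = bias[i] + sum_{j != i} weights[i, j] * x[j]
    """
    others = np.delete(x, i)          # counts of all variables except i
    w = np.delete(weights[i], i)      # matching weights, self-weight dropped
    log_rate = bias[i] + w @ others
    # Safeguard (an addition for this sketch): keep the rate in a range
    # that the Poisson sampler can handle.
    return np.exp(np.clip(log_rate, -30.0, 30.0))


def gibbs_sample(bias, weights, n_samples, burn_in=100, seed=0):
    """Draw joint count vectors by resampling one variable at a time."""
    rng = np.random.default_rng(seed)
    d = len(bias)
    x = np.zeros(d, dtype=np.int64)   # arbitrary starting state
    samples = np.empty((n_samples, d), dtype=np.int64)
    for t in range(burn_in + n_samples):
        for i in range(d):
            # Resample x[i] from its local conditional Poisson distribution.
            x[i] = rng.poisson(conditional_mean(i, x, bias, weights))
        if t >= burn_in:
            samples[t - burn_in] = x
    return samples


# Usage: three count variables with one negative and one positive dependency.
bias = np.log(np.array([2.0, 3.0, 1.5]))
weights = np.array([
    [0.0, -0.2, 0.0],
    [-0.2, 0.0, 0.05],
    [0.0, 0.05, 0.0],
])
draws = gibbs_sample(bias, weights, n_samples=1000)
print(draws.mean(axis=0))
```

With all weights set to zero the variables decouple, and each sampled column is simply Poisson with mean `exp(bias[i])`, which is a convenient sanity check for the sketch.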
Pages: 477-507
Number of pages: 31