Boosting multivariate structured additive distributional regression models

Cited by: 5
Authors
Stroemer, Annika [1]
Klein, Nadja [2, 3]
Staerk, Christian [1]
Klinkhammer, Hannah [1, 4]
Mayr, Andreas [1]
Affiliations
[1] Univ Hosp Bonn, Dept Med Biometr Informat & Epidemiol, Bonn, Germany
[2] Tech Univ Dortmund, Chair Uncertainty Quantificat & Stat Learning, Res Ctr Trustworthy Data Sci & Secur UA Ruhr, Dortmund, Germany
[3] Tech Univ Dortmund, Dept Stat, Dortmund, Germany
[4] Univ Hosp Bonn, Inst Genom Stat & Bioinformat, Bonn, Germany
Keywords
generalized additive models for location, scale and shape; model-based boosting; multivariate Gaussian distribution; multivariate logit model; multivariate Poisson distribution; semiparametric regression; VARIABLE SELECTION; POISSON REGRESSION; R PACKAGE; BIVARIATE; REGULARIZATION; ALGORITHMS
DOI
10.1002/sim.9699
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
We develop a model-based boosting approach for multivariate distributional regression within the framework of generalized additive models for location, scale, and shape. Our approach enables the simultaneous modeling of all distribution parameters of an arbitrary parametric distribution of a multivariate response conditional on explanatory variables, while being applicable to potentially high-dimensional data. Moreover, the boosting algorithm incorporates data-driven variable selection, taking different types of effects into account. A special merit of our approach is that it allows modeling the association between multiple continuous or discrete outcomes through the relevant covariates. After a detailed simulation study investigating estimation and prediction performance, we demonstrate the full flexibility of our approach in three diverse biomedical applications. The first is based on high-dimensional genomic cohort data from the UK Biobank, considering a bivariate binary response (chronic ischemic heart disease and high cholesterol). Here, we are able to identify genetic variants that are informative for the association between cholesterol and heart disease. The second application considers the demand for health care in Australia, with the number of consultations and the number of prescribed medications as a bivariate count response. The third application analyzes two dimensions of childhood undernutrition in Nigeria as a bivariate response; we find that the correlation between the two undernutrition scores differs considerably depending on the child's age and the region the child lives in.
Pages: 1779-1801
Page count: 23