BAYESIAN MIXED EFFECTS MODELS FOR ZERO-INFLATED COMPOSITIONS IN MICROBIOME DATA ANALYSIS

Cited by: 8
Authors
Ren, Boyu [1 ]
Bacallado, Sergio [2 ]
Favaro, Stefano [3 ,4 ]
Vatanen, Tommi [5 ]
Huttenhower, Curtis [1 ,5 ]
Trippa, Lorenzo [1 ]
Affiliations
[1] Harvard Univ, Dept Biostat, Cambridge, MA 02138 USA
[2] Univ Cambridge, Dept Pure Math & Math Stat, Cambridge, England
[3] Univ Torino, Dipartimento Sci Econ Sociali & Matemat Stat, Turin, Italy
[4] Coll Carlo Alberto, Turin, Italy
[5] Univ Auckland, Liggins Inst, Auckland, New Zealand
Source
ANNALS OF APPLIED STATISTICS | 2020 / Vol. 14 / No. 01
Funding
European Research Council;
Keywords
Truncated dependent Dirichlet processes; latent factor model; type 1 diabetes; MULTINOMIAL REGRESSION; GUT MICROBIOME
DOI
10.1214/19-AOAS1295
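Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code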
020208 ; 070103 ; 0714 ;
Abstract
Detecting associations between microbial compositions and sample characteristics is one of the most important tasks in microbiome studies. Most of the existing methods apply univariate models to single microbial species separately, with adjustments for multiple hypothesis testing. We propose a Bayesian analysis for a generalized mixed effects linear model tailored to this application. The marginal prior on each microbial composition is a Dirichlet process, and dependence across compositions is induced through a linear combination of individual covariates, such as disease biomarkers or the subject's age, and latent factors. The latent factors capture residual variability and their dimensionality is learned from the data in a fully Bayesian procedure. The proposed model is tested in data analyses and simulation studies with zero-inflated compositions. In these settings and within each sample, a large proportion of counts per microbial species are equal to zero. In our Bayesian model a priori the probability of compositions with absent microbial species is strictly positive. We propose an efficient algorithm to sample from the posterior and visualizations of model parameters which reveal associations between covariates and microbial compositions. We evaluate the proposed method in simulation studies, and then analyze a microbiome dataset for infants with type 1 diabetes which contains a large proportion of zeros in the sample-specific microbial compositions.
Pages: 494 - 517
Page count: 24