Zero-inflated Poisson factor model with application to microbiome read counts

Cited by: 19
Authors
Xu, Tianchen [1]
Demmer, Ryan T. [2]
Li, Gen [1]
Affiliations
[1] Columbia Univ, Mailman Sch Publ Hlth, Dept Biostat, New York, NY 10032 USA
[2] Univ Minnesota, Sch Publ Hlth, Div Epidemiol, Minneapolis, MN 55455 USA
Funding
National Institutes of Health (USA)
Keywords
16S sequencing; factor analysis; low rank; microbiome data; zero inflation; PERIODONTAL-DISEASE; EXPRESSION;
DOI
10.1111/biom.13272
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Dimension reduction of high-dimensional microbiome data facilitates subsequent analysis such as regression and clustering. Most existing reduction methods cannot fully accommodate the special features of the data such as count-valued and excessive zero reads. We propose a zero-inflated Poisson factor analysis model in this paper. The model assumes that microbiome read counts follow zero-inflated Poisson distributions with library size as offset and Poisson rates negatively related to the inflated zero occurrences. The latent parameters of the model form a low-rank matrix consisting of interpretable loadings and low-dimensional scores that can be used for further analyses. We develop an efficient and robust expectation-maximization algorithm for parameter estimation. We demonstrate the efficacy of the proposed method using comprehensive simulation studies. The application to the Oral Infections, Glucose Intolerance, and Insulin Resistance Study provides valuable insights into the relation between subgingival microbiome and periodontal disease.
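The data-generating structure described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `simulate_zip_factor`, the parameter `tau`, the logistic-type link `p_zero = 1 / (1 + lam ** tau)`, and the way library size is rescaled into a multiplicative offset are all hypothetical choices made here, since the abstract states only that the Poisson rates are negatively related to the inflated zero occurrences.

```python
import numpy as np


def simulate_zip_factor(n=50, m=30, rank=2, tau=1.0, seed=0):
    """Simulate zero-inflated Poisson counts from a low-rank log-rate matrix.

    All names and the specific link function are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Low-dimensional scores (samples) and loadings (taxa).
    scores = rng.normal(size=(n, rank))
    loadings = rng.normal(size=(m, rank))

    # Low-rank matrix of log Poisson rates, scaled down to keep rates moderate.
    log_rate = 0.5 * scores @ loadings.T
    lam = np.exp(log_rate)

    # Library size per sample, used as a multiplicative offset (assumed form).
    library_size = rng.integers(1000, 5000, size=n)
    scale = library_size / library_size.mean()

    # Assumed link: zero-inflation probability decreases as the rate grows.
    p_zero = 1.0 / (1.0 + lam ** tau)

    # Draw Poisson counts, then overwrite structural zeros.
    counts = rng.poisson(scale[:, None] * lam)
    structural_zero = rng.random((n, m)) < p_zero
    counts[structural_zero] = 0

    return {"counts": counts, "lam": lam, "p_zero": p_zero,
            "log_rate": log_rate, "library_size": library_size}


sim = simulate_zip_factor()
print(sim["counts"].shape, float((sim["counts"] == 0).mean()))
```

The printed zero fraction exceeds what a plain Poisson model with the same rates would give, which is the feature a zero-inflated model is meant to capture. Fitting such a model (for example with an expectation-maximization algorithm, as the abstract describes) is not shown here.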
Pages: 91-101
Number of pages: 11