Sparsity of higher-order landscape interactions enables learning and prediction for microbiomes
Cited by: 8
Authors:
Arya, Shreya [1]
George, Ashish B. [2,3,4]
O'Dwyer, James P. [2,4]
Affiliations:
[1] Univ Illinois, Dept Phys, Urbana, IL 61801 USA
[2] Univ Illinois, Carl R Woese Inst Genom Biol, Ctr Artificial Intelligence & Modeling, Urbana, IL 61801 USA
[3] Broad Inst Massachusetts Inst Technol & Harvard, Cambridge 02142, MA USA
[4] Univ Illinois, Dept Plant Biol, Urbana, IL 61801 USA
Microbiome engineering offers the potential to leverage microbial communities to improve outcomes in human health, agriculture, and climate. To translate this potential into reality, it is crucial to reliably predict community composition and function. But a brute-force approach to cataloging community function is hindered by the combinatorial explosion in the number of ways we can combine microbial species. An alternative is to parameterize microbial community outcomes using simplified, mechanistic models, and then extrapolate these models beyond where we have sampled. But these approaches remain data-hungry, as well as requiring an a priori specification of what kinds of mechanisms are included and which are omitted. Here, we resolve both issues by introducing a mechanism-agnostic approach to predicting microbial community compositions and functions using limited data. The critical step is the identification of a sparse representation of the community landscape. We then leverage this sparsity to predict community compositions and functions, drawing from techniques in compressive sensing. We validate this approach on in silico community data, generated from a theoretical model. By sampling just ~1% of all possible communities, we accurately predict community compositions out of sample. We then demonstrate the real-world application of our approach by applying it to four experimental datasets and showing that we can recover interpretable, accurate predictions on composition and community function from highly limited data.
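The sparse-recovery idea in the abstract can be illustrated with a small sketch. The abstract does not specify the basis or the solver, so everything here is an assumption made for illustration: a Walsh (parity) basis over species presence/absence, a synthetic landscape with four nonzero coefficients, an L1-penalized least-squares fit (Lasso) solved by coordinate descent, and the hypothetical names `walsh_row`, `lasso_cd`, `predict`, and `LAM`. The sampled fraction (half of 64 communities) is chosen for this toy problem and is not the ~1% regime reported above.

```python
import itertools
import random


def walsh_row(x):
    """Walsh (parity) features of a presence/absence vector x in {0,1}^N.

    The feature for a species subset S is the product of s_i over i in S,
    with s_i = +1 if species i is present and -1 if absent.
    The empty subset gives the constant feature 1.
    """
    signs = [1 if xi else -1 for xi in x]
    row = []
    for order in range(len(x) + 1):
        for subset in itertools.combinations(range(len(x)), order):
            value = 1
            for i in subset:
                value *= signs[i]
            row.append(value)
    return row


def soft_threshold(z, lam):
    """Shrink z toward zero by lam (proximal operator of the L1 penalty)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(rows, y, lam, sweeps=200):
    """Minimize (1/2m)||y - Phi a||^2 + lam*||a||_1 by coordinate descent.

    Assumes every entry of Phi is +1 or -1, so each column has squared
    norm m and the per-coordinate update is a plain soft threshold.
    """
    m, n = len(rows), len(rows[0])
    coef = [0.0] * n
    resid = list(y)
    for _ in range(sweeps):
        for j in range(n):
            rho = sum(rows[i][j] * resid[i] for i in range(m)) / m + coef[j]
            new = soft_threshold(rho, lam)
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(m):
                    resid[i] -= rows[i][j] * delta
                coef[j] = new
    return coef


def predict(coef, x):
    """Evaluate the landscape at community x from Walsh coefficients."""
    return sum(c * f for c, f in zip(coef, walsh_row(x)))


# Synthetic (made-up) sparse landscape on N = 6 species, 64 communities.
random.seed(0)
N = 6
LAM = 0.01  # illustrative penalty strength, not a tuned value
communities = list(itertools.product([0, 1], repeat=N))
true_coef = [0.0] * (2 ** N)
for j in random.sample(range(2 ** N), 4):
    true_coef[j] = random.uniform(-1.0, 1.0)
landscape = {x: predict(true_coef, x) for x in communities}

# Observe only half of the communities, then fit sparse coefficients.
train = random.sample(communities, 32)
rows = [walsh_row(x) for x in train]
y = [landscape[x] for x in train]
coef = lasso_cd(rows, y, LAM)

# Out-of-sample error on the communities that were never observed.
seen = set(train)
held_out = [x for x in communities if x not in seen]
rmse = (sum((predict(coef, x) - landscape[x]) ** 2 for x in held_out)
        / len(held_out)) ** 0.5
print("nonzero fitted coefficients:", sum(1 for c in coef if c != 0.0))
print("held-out RMSE:", rmse)
```

How well the held-out communities are predicted depends on how sparse the landscape really is in the chosen basis and on how many communities are sampled; a dense landscape would not be recoverable this way.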