Identifiability and inference of phylogenetic birth-death models

Cited by: 4
Authors
Legried, Brandon [1]
Terhorst, Jonathan [2]
Affiliations
[1] Georgia Inst Technol, Sch Math, 686 Cherry St, Atlanta, GA 30332 USA
[2] Univ Michigan, Dept Stat, 1085 S Univ Ave, Ann Arbor, MI 48109 USA
Funding
National Science Foundation (USA);
Keywords
Phylogenetics; Phylodynamics; Birth-death model; Identifiability; RATES; TIME;
DOI
10.1016/j.jtbi.2023.111520
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Recent theoretical work on phylogenetic birth-death models offers differing viewpoints on whether they can be estimated using lineage-through-time data. Louca and Pennell (2020) showed that the class of models with continuously differentiable rate functions is nonidentifiable: any such model is consistent with an infinite collection of alternative models, which are statistically indistinguishable regardless of how much data are collected. Legried and Terhorst (2022) qualified this grave result by showing that identifiability is restored if only piecewise constant rate functions are considered. Here, we contribute new theoretical results to this discussion, in both the positive and negative directions. Our main result is to prove that models based on piecewise polynomial rate functions of any order and with any (finite) number of pieces are statistically identifiable. In particular, this implies that spline-based models with an arbitrary number of knots are identifiable. The proof is simple and self-contained, relying mainly on basic algebra. We complement this positive result with a negative one, which shows that even when identifiability holds, rate function estimation is still a difficult problem. To illustrate this, we prove some rates-of-convergence results for hypothesis testing using birth-death models. These results are information-theoretic lower bounds which apply to all potential estimators.
Pages: 9