Accurate and efficient cell lineage tree inference from noisy single cell data: the maximum likelihood perfect phylogeny approach

Cited by: 17
Author
Wu, Yufeng [1]
Affiliation
[1] Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA
Funding
National Science Foundation (USA)
Keywords
HETEROGENEITY; NUCLEOTIDE; EVOLUTION;
DOI
10.1093/bioinformatics/btz676
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Cells in an organism share a common evolutionary history, called the cell lineage tree. A cell lineage tree can be inferred from single cell genotypes at genomic variation sites. Inferring a cell lineage tree from noisy single cell data is a challenging computational problem. Most existing methods for cell lineage tree inference assume uniform uncertainty in genotypes. A key missing aspect is that real single cell data usually have non-uniform uncertainty in individual genotypes. Moreover, existing methods are often sampling based and can be very slow on large data.

Results: In this article, we propose a new method called ScisTree, which infers the cell lineage tree and calls genotypes from noisy single cell genotype data. Unlike most existing approaches, ScisTree works with the probabilities of individual genotypes (which can be computed by existing single cell genotype callers). ScisTree assumes the infinite sites model. Given uncertain genotypes with individualized probabilities, ScisTree implements a fast heuristic that infers the cell lineage tree and calls the genotypes that admit a so-called perfect phylogeny and maximize the likelihood of the genotypes. Through simulation, we show that ScisTree infers accurate trees and is much more efficient than existing methods. The efficiency of ScisTree enables new applications, including imputation of so-called doublets.
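The objective and constraint described in the abstract can be illustrated with a small sketch. This is not ScisTree's heuristic or its code; it only shows, under stated assumptions, what "maximize the likelihood of the genotypes" and "admit a perfect phylogeny" mean for binary genotypes. The assumptions are: genotypes are binary (0 = wild type, 1 = mutant), the root of the tree is all-zero, entries are independent, and the input `prob_zero[c][s]` is the probability that cell `c` has genotype 0 at site `s`. All function names are hypothetical.

```python
import math
from itertools import combinations


def log_likelihood(genotypes, prob_zero):
    """Log-probability of a binary genotype matrix, assuming independent entries.

    prob_zero[c][s] is the (assumed) probability that cell c is 0 at site s.
    Probabilities must lie strictly between 0 and 1 to avoid log(0).
    """
    total = 0.0
    for g_row, p_row in zip(genotypes, prob_zero):
        for g, p0 in zip(g_row, p_row):
            total += math.log(p0 if g == 0 else 1.0 - p0)
    return total


def max_prob_call(prob_zero):
    """Unconstrained call: pick the more probable genotype for each entry."""
    return [[0 if p0 >= 0.5 else 1 for p0 in row] for row in prob_zero]


def admits_perfect_phylogeny(genotypes):
    """Rooted perfect phylogeny test with an all-zero root.

    Under the infinite sites model each site mutates once, so the sets of
    cells carrying each mutation must be pairwise disjoint or nested.
    """
    n_sites = len(genotypes[0])
    carriers = [
        frozenset(c for c, row in enumerate(genotypes) if row[s] == 1)
        for s in range(n_sites)
    ]
    for a, b in combinations(carriers, 2):
        shared = a & b
        if shared and shared != a and shared != b:
            return False  # the two sites overlap without nesting: conflict
    return True


# Toy input: 3 cells x 2 sites (values are made up for illustration).
prob_zero = [[0.1, 0.2],
             [0.2, 0.9],
             [0.6, 0.3]]
called = max_prob_call(prob_zero)    # entry-wise most probable genotypes
repaired = [[1, 1], [1, 0], [1, 1]]  # one low-confidence entry flipped by hand
```

In this toy input the entry-wise most probable call contains a conflict between the two sites, while flipping the least certain entry (cell 2, site 0) gives a matrix that admits a perfect phylogeny at a small cost in likelihood. A method of the kind the abstract describes searches for the conflict-free matrix with the highest likelihood; the search itself is not shown here.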
Pages: 742-750
Page count: 9