hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation

Cited by: 1
Authors
Choo-Wosoba, Hyoyoung [1]
Albert, Paul S. [1]
Zhu, Bin [1]
Affiliation
[1] NCI, Biostat Branch, Div Canc Epidemiol & Genet, NIH, Bethesda, MD 20892 USA
Source
BMC BIOINFORMATICS | 2018 / Volume 19
Funding
National Institutes of Health (USA);
Keywords
Allele-specific somatic copy number alteration; Hidden Markov model; Hypersegmentation; Next-generation sequencing; The Cancer Genome Atlas study; SNP GENOTYPING DATA; CANCER
DOI
10.1186/s12859-018-2412-y
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Somatic copy number alteration (SCNA) is a common feature of the cancer genome and is associated with cancer etiology and prognosis. Allele-specific SCNA analysis of a tumor sample aims to identify the copy numbers of both alleles, adjusting for ploidy and tumor purity. Next-generation sequencing platforms produce abundant read counts at base-pair resolution across the exome or whole genome, and analyses of these data are susceptible to hypersegmentation, a phenomenon in which numerous very short regions are falsely identified as SCNAs.
Results: We propose hsegHMM, a hidden Markov model approach to allele-specific SCNA analysis that accounts for hypersegmentation. hsegHMM provides statistical inference of copy number profiles using an efficient E-M algorithm. Through simulation and application studies, we found that hsegHMM handles hypersegmentation effectively by using a t-distribution as part of the emission probability distribution and a carefully defined state space. We also compared hsegHMM with FACETS, a current method for allele-specific SCNA analysis. For the application, we used a renal cell carcinoma sample from The Cancer Genome Atlas (TCGA) study.
Conclusions: We demonstrate the robustness of hsegHMM to hypersegmentation. Furthermore, hsegHMM quantifies the uncertainty in identifying allele-specific SCNAs along entire chromosomes. hsegHMM performs better than FACETS when read depth (coverage) is uneven across the genome.
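Illustrative sketch (not the authors' implementation): the abstract attributes the robustness to hypersegmentation partly to a t-distributed emission component. The toy example below shows the general idea on a two-state HMM with fixed parameters, decoded with the Viterbi algorithm rather than fitted by E-M. The state means, scale, degrees of freedom, stay probability, and observations are all assumptions chosen for illustration; the actual hsegHMM state space and emission model are more elaborate.

```python
import math

def log_gaussian(x, mu, s):
    """Log density of a normal distribution with mean mu and sd s."""
    z = (x - mu) / s
    return -0.5 * math.log(2 * math.pi) - math.log(s) - 0.5 * z * z

def log_student_t(x, mu, s, nu):
    """Log density of a location-scale Student-t with nu degrees of freedom."""
    z = (x - mu) / s
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(s)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def viterbi(obs, means, log_emission, stay=0.99):
    """Most likely state path for an HMM with a uniform start and a
    symmetric transition matrix (probability `stay` of keeping the state)."""
    k = len(means)
    log_stay = math.log(stay)
    log_switch = math.log((1 - stay) / (k - 1))
    score = [-math.log(k) + log_emission(obs[0], m) for m in means]
    back = []  # back[t-1][j] = best predecessor of state j at time t
    for x in obs[1:]:
        prev, pointers, score = score, [], []
        for j, m in enumerate(means):
            cands = [prev[i] + (log_stay if i == j else log_switch)
                     for i in range(k)]
            best = max(range(k), key=cands.__getitem__)
            pointers.append(best)
            score.append(cands[best] + log_emission(x, m))
        back.append(pointers)
    state = max(range(k), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical signal: a flat segment near 0 with a single outlier at 1.0.
obs = [0.0, 0.1, -0.1, 1.0, 0.05, -0.05, 0.0]
means = [0.0, 1.0]   # assumed state means
scale = 0.2          # assumed emission scale
nu = 3               # assumed degrees of freedom for the t emission

gaussian_path = viterbi(obs, means, lambda x, m: log_gaussian(x, m, scale))
t_path = viterbi(obs, means, lambda x, m: log_student_t(x, m, scale, nu))

print("Gaussian emissions: ", gaussian_path)  # [0, 0, 0, 1, 0, 0, 0]
print("Student-t emissions:", t_path)         # [0, 0, 0, 0, 0, 0, 0]
```

With Gaussian emissions the single outlier is cheaper to explain as a one-point state change, which is the kind of very short spurious segment the abstract calls hypersegmentation. The heavier tails of the t-distribution make the outlier less costly under the current state, so the decoded path stays in one segment.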
Pages: 14