hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation

Cited by: 1
Authors
Choo-Wosoba, Hyoyoung [1]
Albert, Paul S. [1]
Zhu, Bin [1]
Affiliations
[1] NCI, Biostat Branch, Div Canc Epidemiol & Genet, NIH, Bethesda, MD 20892 USA
Source
BMC Bioinformatics | 2018 / Volume 19
Funding
National Institutes of Health (USA)
Keywords
Allele-specific somatic copy number alteration; Hidden Markov model; Hypersegmentation; Next-generation sequencing; The Cancer Genome Atlas study; SNP genotyping data; Cancer
DOI
10.1186/s12859-018-2412-y
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Somatic copy number alteration (SCNA) is a common feature of the cancer genome and is associated with cancer etiology and prognosis. Allele-specific SCNA analysis of a tumor sample aims to identify the copy numbers of both alleles, adjusting for ploidy and tumor purity. Next-generation sequencing platforms produce abundant read counts at base-pair resolution across the exome or whole genome, and these data are susceptible to hypersegmentation, a phenomenon in which numerous very short regions are falsely identified as SCNAs.
Results: We propose hsegHMM, a hidden Markov model approach to allele-specific SCNA analysis that accounts for hypersegmentation. hsegHMM provides statistical inference of copy number profiles using an efficient expectation-maximization (E-M) algorithm. Through simulation and application studies, we found that hsegHMM handles hypersegmentation effectively by using a t-distribution as part of the emission probability distribution and a carefully defined state space. We also compared hsegHMM with FACETS, a current method for allele-specific SCNA analysis. For the application, we used a renal cell carcinoma sample from The Cancer Genome Atlas (TCGA) study.
Conclusions: We demonstrate the robustness of hsegHMM to hypersegmentation. Furthermore, hsegHMM quantifies the uncertainty in identifying allele-specific SCNAs across entire chromosomes. hsegHMM performs better than FACETS when read depth (coverage) is uneven across the genome.
Pages: 14