An empirical Bayesian framework for somatic mutation detection from cancer genome sequencing data

Cited by: 172
Authors
Shiraishi, Yuichi [1]
Sato, Yusuke [2,3]
Chiba, Kenichi [1]
Okuno, Yusuke [2]
Nagata, Yasunobu [2]
Yoshida, Kenichi [2]
Shiba, Norio [2,4]
Hayashi, Yasuhide [4]
Kume, Haruki [3]
Homma, Yukio [3]
Sanada, Masashi [2]
Ogawa, Seishi [2]
Miyano, Satoru [1]
Affiliations
[1] Univ Tokyo, Lab DNA Informat Anal, Ctr Human Genome, Inst Med Sci, Minato Ku, Tokyo 1088639, Japan
[2] Univ Tokyo, Canc Genom Project, Grad Sch Med, Bunkyo Ku, Tokyo 1138655, Japan
[3] Univ Tokyo, Dept Urol, Grad Sch Med, Bunkyo Ku, Tokyo 1138655, Japan
[4] Gunma Childrens Med Ctr, Dept Hematol Oncol, Gunma 3770061, Japan
Keywords
ALIGNMENT; EVOLUTION; VARIANTS
DOI
10.1093/nar/gkt126
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Recent advances in high-throughput sequencing technologies have enabled a comprehensive dissection of the cancer genome clarifying a large number of somatic mutations in a wide variety of cancer types. A number of methods have been proposed for mutation calling based on a large amount of sequencing data, which is accomplished in most cases by statistically evaluating the difference in the observed allele frequencies of possible single nucleotide variants between tumours and paired normal samples. However, an accurate detection of mutations remains a challenge under low sequencing depths or tumour contents. To overcome this problem, we propose a novel method, Empirical Bayesian mutation Calling (https://github.com/friend1ws/EBCall), for detecting somatic mutations. Unlike previous methods, the proposed method discriminates somatic mutations from sequencing errors based on an empirical Bayesian framework, where the model parameters are estimated using sequencing data from multiple non-paired normal samples. Using 13 whole-exome sequencing data with 87.5-206.3 mean sequencing depths, we demonstrate that our method not only outperforms several existing methods in the calling of mutations with moderate allele frequencies but also enables accurate calling of mutations with low allele frequencies (10%) harboured within a minor tumour subpopulation, thus allowing for the deciphering of fine substructures within a tumour specimen.
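Illustrative sketch (not part of the original record, and not EBCall's actual implementation): the abstract's central idea, estimating a site-specific error model from several non-paired normal samples and then asking whether the tumour's variant read count is plausible under that model, can be illustrated roughly as follows. The beta-binomial error model, the crude method-of-moments fit, the function names, and the read counts are all assumptions chosen for illustration.

```python
from math import lgamma, exp


def fit_beta_moments(alt_counts, depths):
    """Crude method-of-moments Beta(a, b) fit to per-sample mismatch rates.

    Assumption: the spread of observed rates across normal samples is used
    directly, ignoring binomial sampling noise.
    """
    rates = [k / n for k, n in zip(alt_counts, depths)]
    mean = sum(rates) / len(rates)
    var = sum((r - mean) ** 2 for r in rates) / (len(rates) - 1)
    if not (0.0 < var < mean * (1.0 - mean)):
        raise ValueError("moments do not define a valid Beta distribution")
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, n, a, b):
    """Probability of k variant reads out of n under a beta-binomial model."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def error_tail_prob(k, n, a, b):
    """P(X >= k): chance that sequencing error alone yields k or more variant reads."""
    return sum(betabinom_pmf(i, n, a, b) for i in range(k, n + 1))


# Hypothetical variant-read counts and depths at one site in five normal samples.
normal_alt = [1, 0, 2, 1, 3]
normal_depth = [100, 120, 150, 90, 200]
a, b = fit_beta_moments(normal_alt, normal_depth)

# Hypothetical tumour observation: 10 variant reads out of 100 (10% allele frequency).
p_error = error_tail_prob(10, 100, a, b)
print(f"a={a:.2f}, b={b:.2f}, P(error yields >= 10 variant reads) = {p_error:.3g}")
```

A small tail probability suggests the tumour's variant reads are unlikely to arise from site-specific sequencing error alone; a real caller would also need to handle strand bias, mapping quality, and multiple testing, none of which this sketch addresses.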
Pages: 10