An empirical Bayesian framework for somatic mutation detection from cancer genome sequencing data

Cited by: 172
Authors
Shiraishi, Yuichi [1]
Sato, Yusuke [2, 3]
Chiba, Kenichi [1]
Okuno, Yusuke [2]
Nagata, Yasunobu [2]
Yoshida, Kenichi [2]
Shiba, Norio [2, 4]
Hayashi, Yasuhide [4]
Kume, Haruki [3]
Homma, Yukio [3]
Sanada, Masashi [2]
Ogawa, Seishi [2]
Miyano, Satoru [1]
Affiliations
[1] Univ Tokyo, Lab DNA Informat Anal, Ctr Human Genome, Inst Med Sci, Minato Ku, Tokyo 1088639, Japan
[2] Univ Tokyo, Canc Genom Project, Grad Sch Med, Bunkyo Ku, Tokyo 1138655, Japan
[3] Univ Tokyo, Dept Urol, Grad Sch Med, Bunkyo Ku, Tokyo 1138655, Japan
[4] Gunma Childrens Med Ctr, Dept Hematol Oncol, Gunma 3770061, Japan
Keywords
ALIGNMENT; EVOLUTION; VARIANTS;
DOI
10.1093/nar/gkt126
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Recent advances in high-throughput sequencing technologies have enabled a comprehensive dissection of the cancer genome clarifying a large number of somatic mutations in a wide variety of cancer types. A number of methods have been proposed for mutation calling based on a large amount of sequencing data, which is accomplished in most cases by statistically evaluating the difference in the observed allele frequencies of possible single nucleotide variants between tumours and paired normal samples. However, an accurate detection of mutations remains a challenge under low sequencing depths or tumour contents. To overcome this problem, we propose a novel method, Empirical Bayesian mutation Calling (https://github.com/friend1ws/EBCall), for detecting somatic mutations. Unlike previous methods, the proposed method discriminates somatic mutations from sequencing errors based on an empirical Bayesian framework, where the model parameters are estimated using sequencing data from multiple non-paired normal samples. Using 13 whole-exome sequencing data with 87.5-206.3 mean sequencing depths, we demonstrate that our method not only outperforms several existing methods in the calling of mutations with moderate allele frequencies but also enables accurate calling of mutations with low allele frequencies (10%) harboured within a minor tumour subpopulation, thus allowing for the deciphering of fine substructures within a tumour specimen.
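The idea in the abstract, learning a site-specific error model from non-paired normal samples and then asking whether the tumour's variant read count is plausible under that model, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the EBCall code base: the beta-binomial error model, the method-of-moments fit on raw mismatch ratios (which ignores binomial sampling noise), the function names `fit_beta_moments`, `betabinom_pmf` and `error_tail_prob`, and the toy numbers.

```python
import math
from statistics import mean, variance


def fit_beta_moments(ratios):
    """Fit Beta(alpha, beta) to per-sample mismatch ratios by method of moments.

    Illustrative assumption: each normal sample contributes one mismatch ratio
    at the site, and the ratios are treated as draws from a Beta distribution.
    """
    m = mean(ratios)
    v = variance(ratios)  # sample variance, needs at least two ratios
    if not (0.0 < m < 1.0) or not (0.0 < v < m * (1.0 - m)):
        raise ValueError("ratios do not admit a method-of-moments Beta fit")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


def _log_beta(a, b):
    # log of the Beta function, computed via log-gamma for numerical stability
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_pmf(k, n, alpha, beta):
    """Probability of k variant reads out of n under a beta-binomial model."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + _log_beta(k + alpha, n - k + beta) - _log_beta(alpha, beta))


def error_tail_prob(k, n, alpha, beta):
    """P(X >= k): chance that sequencing error alone yields k or more variant reads."""
    total = sum(betabinom_pmf(i, n, alpha, beta) for i in range(k, n + 1))
    return min(1.0, total)


# Toy mismatch ratios at one genomic position across five normal samples (made up).
normal_ratios = [0.001, 0.002, 0.0015, 0.003, 0.0005]
alpha, beta = fit_beta_moments(normal_ratios)

# Tumour sample at the same position: 12 variant reads out of 120 (made up).
p_error = error_tail_prob(12, 120, alpha, beta)
print(f"alpha={alpha:.3f}, beta={beta:.1f}, P(error explains >=12/120)={p_error:.3e}")
```

A small tail probability suggests that the observed variant reads are unlikely to be explained by the error profile seen in the normal panel. A real caller would also need strand handling, base and mapping quality filters, and a comparison against the paired normal, none of which this sketch attempts.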
Pages: 10
Related papers
50 in total
  • [31] Heterogeneous Cloud Framework for Big Data Genome Sequencing
    Wang, Chao
    Li, Xi
    Chen, Peng
    Wang, Aili
    Zhou, Xuehai
    Yu, Hong
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2015, 12 (01) : 166 - 178
  • [32] Dynamic genomic indexing enables accurate somatic variant detection from cancer genome sequencing without sequence alignment limitations
    Lee, HoJoon
    Parker, Jacob J.
    Bell, John
    Ji, Hanlee P.
    CANCER RESEARCH, 2016, 76
  • [33] On somatic mutation detection without matched control data
    Yamamoto, Mako
    Chiba, Kenichi
    Okada, Ai
    Miyano, Satoru
    Shiraishi, Yuichi
    CANCER SCIENCE, 2018, 109 : 786 - 786
  • [34] A Bayesian framework for safety signal detection from medical device data
    Xu, Jianjin
    Chakraborty, Adrijo
    Sachdeva, Archie
    Tiwari, Ram
    JOURNAL OF BIOPHARMACEUTICAL STATISTICS, 2025,
  • [35] A Bayesian Framework to Identify Methylcytosines from High-Throughput Bisulfite Sequencing Data
    Xie, Qing
    Liu, Qi
    Mao, Fengbiao
    Cai, Wanshi
    Wu, Honghu
    You, Mingcong
    Wang, Zhen
    Chen, Bingyu
    Sun, Zhong Sheng
    Wu, Jinyu
    PLOS COMPUTATIONAL BIOLOGY, 2014, 10 (09)
  • [36] Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion
    Xi, Ruibin
    Hadjipanayis, Angela G.
    Luquette, Lovelace J.
    Kim, Tae-Min
    Lee, Eunjung
    Zhang, Jianhua
    Johnson, Mark D.
    Muzny, Donna M.
    Wheeler, David A.
    Gibbs, Richard A.
    Kucherlapati, Raju
    Park, Peter J.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (46) : E1128 - E1136
  • [37] Identification and Interpretation of Clinically Relevant Somatic Variants from Whole-Genome Sequencing Data
    Maqbool, Khurram
    Foroughi-Asl, Hassan
    Jeggari, Ashwini
    Ivanchuk, Vadym
    Eisfeldt, Jesper
    Renevey, Annick
    Elhami, Keyvan
    Rasi, Chiara
    Nilsson, Daniel
    Heinaniemi, Merja
    Lohi, Olli
    Wirta, Valtteri
    BLOOD, 2022, 140 : 13003 - 13004
  • [38] Pan-cancer discovery of somatic mutations from RNA sequencing data
    Tang, Gongyu
    Liu, Xinyi
    Cho, Minsu
    Li, Yuanxiang
    Tran, Dan-Ho
    Wang, Xiaowei
    COMMUNICATIONS BIOLOGY, 2024, 7 (01)
  • [39] SEG - A Software Program for Finding Somatic Copy Number Alterations in Whole Genome Sequencing Data of Cancer
    Zhang, Mucheng
    Liu, Deli
    Tang, Jie
    Feng, Yuan
    Wang, Tianfang
    Dobbin, Kevin K.
    Schliekelman, Paul
    Zhao, Shaying
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2018, 16 : 335 - 341
  • [40] SomaticSniper: identification of somatic point mutations in whole genome sequencing data
    Larson, David E.
    Harris, Christopher C.
    Chen, Ken
    Koboldt, Daniel C.
    Abbott, Travis E.
    Dooling, David J.
    Ley, Timothy J.
    Mardis, Elaine R.
    Wilson, Richard K.
    Ding, Li
    BIOINFORMATICS, 2012, 28 (03) : 311 - 317