A Family-Based Probabilistic Method for Capturing De Novo Mutations from High-Throughput Short-Read Sequencing Data

Cited by: 12

Authors:

Cartwright, Reed A. [2]

Hussin, Julie [3]

Keebler, Jonathan E. M. [1]

Stone, Eric A. [1]

Awadalla, Philip [3]

Affiliations:

[1] N Carolina State Univ, Raleigh, NC 27695 USA

[2] Arizona State Univ, Tempe, AZ 85287 USA

[3] Univ Montreal, Montreal, PQ H3C 3J7, Canada

Source:

STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY | 2012 / Volume 11 / Issue 02

Funding:

U.S. National Institutes of Health;

Keywords:

de novo mutations; pedigree; short-read data; mutation rates; trio model; MAXIMUM-LIKELIHOOD; EM ALGORITHM; RATES; DNA; GENOTYPE;

DOI:

10.2202/1544-6115.1713

Chinese Library Classification (CLC):

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date.
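
The idea of a per-site posterior probability of de novo mutation can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes a single biallelic site, a father-mother-child trio, a binomial read-error model, Hardy-Weinberg priors for the parents, and one germline mutation probability per transmitted allele. All function names and default parameter values (`error_rate`, `alt_freq`, `mutation_rate`) are hypothetical and chosen only for illustration.

```python
from itertools import product


def read_likelihood(reads, genotype, error_rate):
    """P(reads | genotype), omitting the binomial coefficient (it cancels)."""
    ref_count, alt_count = reads
    # Probability that a single read shows the alternate allele.
    p_alt = {0: error_rate, 1: 0.5, 2: 1.0 - error_rate}[genotype]
    return (p_alt ** alt_count) * ((1.0 - p_alt) ** ref_count)


def genotype_prior(genotype, alt_freq):
    """Hardy-Weinberg prior for a founder carrying 0, 1, or 2 alt alleles."""
    q = alt_freq
    return {0: (1 - q) ** 2, 1: 2 * q * (1 - q), 2: q ** 2}[genotype]


def de_novo_posterior(father, mother, child,
                      error_rate=0.01, alt_freq=0.001, mutation_rate=1e-8):
    """Posterior probability that at least one transmitted allele mutated.

    Each argument is a (ref_count, alt_count) tuple of read counts.
    All defaults are illustrative assumptions, not values from the paper.
    """
    total = 0.0
    with_mutation = 0.0
    # Enumerate parental genotypes, transmitted alleles, and mutation flags.
    for gf, gm, af, am, mf, mm in product(range(3), range(3),
                                          (0, 1), (0, 1), (0, 1), (0, 1)):
        p_af = gf / 2 if af else 1 - gf / 2   # father transmits alt or ref
        p_am = gm / 2 if am else 1 - gm / 2   # mother transmits alt or ref
        p_mf = mutation_rate if mf else 1 - mutation_rate
        p_mm = mutation_rate if mm else 1 - mutation_rate
        gc = (af ^ mf) + (am ^ mm)            # child's alt-allele count
        weight = (genotype_prior(gf, alt_freq) * genotype_prior(gm, alt_freq)
                  * p_af * p_am * p_mf * p_mm
                  * read_likelihood(father, gf, error_rate)
                  * read_likelihood(mother, gm, error_rate)
                  * read_likelihood(child, gc, error_rate))
        total += weight
        if mf or mm:
            with_mutation += weight
    return with_mutation / total


# Parents look homozygous reference, child looks heterozygous:
# an apparent Mendelian violation, so the posterior should be high.
print(de_novo_posterior((30, 0), (30, 0), (15, 15)))
# Father looks heterozygous: ordinary inheritance explains the child.
print(de_novo_posterior((15, 15), (30, 0), (15, 15)))
```

Under these assumptions, the posterior is driven by how well sequencing error or an undersampled heterozygous parent can explain an apparent Mendelian violation compared with a true mutation. Somatic mutation and parameter estimation, which the abstract also mentions, are left out of this sketch.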


Pages: 32

50 items in total

[1] Accelerating Error Correction in High-Throughput Short-Read DNA Sequencing Data with CUDA
Shi, Haixiang
Schmidt, Bertil
Liu, Weiguo
Mueller-Wittig, Wolfgang
2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5, 2009, : 1546 - 1553
[2] A framework for the detection of de novo mutations in family-based sequencing data
Laurent C Francioli
Mircea Cretu-Stancu
Kiran V Garimella
Menachem Fromer
Wigard P Kloosterman
Kaitlin E Samocha
Benjamin M Neale
Mark J Daly
Eric Banks
Mark A DePristo
Paul IW de Bakker
European Journal of Human Genetics, 2017, 25 : 227 - 233
[3] A framework for the detection of de novo mutations in family-based sequencing data
Francioli, Laurent C.
Cretu-Stancu, Mircea
Garimella, Kiran V.
Fromer, Menachem
Kloosterman, Wigard P.
Samocha, Kaitlin E.
Neale, Benjamin M.
Daly, Mark J.
Banks, Eric
DePristo, Mark A.
de Bakker, Paul I. W.
EUROPEAN JOURNAL OF HUMAN GENETICS, 2017, 25 (02) : 227 - 233
[4] Whole-Genome Sequencing and Assembly with High-Throughput, Short-Read Technologies
Sundquist, Andreas
Ronaghi, Mostafa
Tang, Haixu
Pevzner, Pavel
Batzoglou, Serafim
PLOS ONE, 2007, 2 (05):
[5] Base calling for high-throughput short-read sequencing: dynamic programming solutions
Das, Shreepriya
Vikalo, Haris
BMC BIOINFORMATICS, 2013, 14
[6] Base calling for high-throughput short-read sequencing: dynamic programming solutions
Shreepriya Das
Haris Vikalo
BMC Bioinformatics, 14
[7] BayesCall: A model-based base-calling algorithm for high-throughput short-read sequencing
Kao, Wei-Chun
Stevens, Kristian
Song, Yun S.
GENOME RESEARCH, 2009, 19 (10) : 1884 - 1895
[8] HAT: de novo variant calling for highly accurate short-read and long-read sequencing data
Ng, Jeffrey K.
Turner, Tychele N.
BIOINFORMATICS, 2024, 40 (01)
[9] High-throughput Interpretation of Killer-cell Immunoglobulin-like Receptor Short-read Sequencing Data with PING
Marin, Wesley M.
Dandekar, Ravi
Augusto, Danillo G.
Yusufali, Tasneem
Heyn, Bianca
Hofmann, Jan
Lange, Vinzenz
Sauter, Juergen
Norman, Paul J.
Hollenbach, Jill A.
PLOS COMPUTATIONAL BIOLOGY, 2021, 17 (08)
[10] Optimization of de novo transcriptome assembly from high-throughput short read sequencing data improves functional annotation for non-model organisms
Berat Z Haznedaroglu
Darryl Reeves
Hamid Rismani-Yazdi
Jordan Peccia
BMC Bioinformatics, 13
