On combining family- and population-based sequencing data

Authors
Yuriko Katsumata
David W. Fardo
Affiliations
[1] University of Kentucky College of Public Health, Department of Biostatistics
Keywords
Combine Data; Exome Sequencing; Genetic Analysis Workshop; Inflated Type; Variant Call Format
DOI
10.1186/s12919-016-0026-9
Abstract
Several group-based statistical approaches have been proposed to detect the effects of variation within a gene, for both population- and family-based designs. However, unified tests that combine gene-phenotype associations obtained from these two study designs are not yet well established. In this study, we investigated how to combine population-based and family-based sequencing data efficiently, evaluating best practices with the Genetic Analysis Workshop 19 (GAW19) data set. Because one design employed whole genome sequencing and the other whole exome sequencing, we examined only the variants present in both data sets. We used the family-based sequence kernel association test (famSKAT) to analyze the family- and population-based data sets separately, as well as a combined data set, and compared these analyses against meta-analysis. Using the combined data, we showed that famSKAT has high power to detect associations between diastolic and/or systolic blood pressure and genes that harbor causal variants with large effect sizes, such as MAP4, TNN, and CGN. However, when power differed considerably between the family- and population-based data, famSKAT on the combined data had lower power than famSKAT on the population-based data alone. The famSKAT test statistic for the combined data can be influenced by sample imbalance between the two designs. This underscores the importance of foresight in study design because, in this situation, the much smaller sample size of the family-based data essentially dilutes the signal. We observed inflated type I error rates in our simulation study, largely when using population-based data, which might result from principal components failing to account completely for population admixture in this cohort.
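The abstract does not say which meta-analysis method was used for comparison. As a minimal sketch of the general idea, and not the authors' procedure, the example below combines gene-level p-values from two designs with a sample-size-weighted Stouffer Z-score. The function name `stouffer_combine`, the weighting by the square root of the sample size, and all numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_combine(p_values, sample_sizes):
    """Combine upper-tail p-values with a weighted Stouffer Z-score.

    Assumption: each study is weighted by sqrt(sample size). This is one
    common convention, not necessarily the one used in the paper.
    """
    norm = NormalDist()
    weights = [sqrt(n) for n in sample_sizes]
    # Convert each p-value to the Z-score with the same upper-tail area.
    z_scores = [norm.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    # Convert the combined Z-score back to an upper-tail p-value.
    return 1.0 - norm.cdf(z_combined)


# Hypothetical gene-level results: a strong signal in a large population
# sample and no signal in a much smaller family sample.
p_population, n_population = 0.001, 1800
p_family, n_family = 0.5, 200

p_combined = stouffer_combine(
    [p_population, p_family], [n_population, n_family]
)
print(f"population only: {p_population:.4g}, combined: {p_combined:.4g}")
```

With these made-up inputs, the combined p-value is larger than the population-only p-value. This is a simple analogue of the dilution the abstract describes when the two designs differ greatly in power.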