Efficient phasing and imputation of low-coverage sequencing data using large reference panels

Cited by: 174
Authors
Rubinacci, Simone [1, 2]
Ribeiro, Diogo M. [1, 2]
Hofmeister, Robin J. [1, 2]
Delaneau, Olivier [1, 2]
Affiliations
[1] Univ Lausanne, Dept Computat Biol, Lausanne, Switzerland
[2] Univ Lausanne, Swiss Inst Bioinformat, Lausanne, Switzerland
Keywords
LINKAGE DISEQUILIBRIUM; GENOTYPE IMPUTATION; GENOME; ASSOCIATION; DISCOVERY; FRAMEWORK; RESOURCE; SNP;
DOI
10.1038/s41588-020-00756-0
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
GLIMPSE is a new method for haplotype phasing and genotype imputation of low-coverage sequencing datasets from large reference panels. GLIMPSE shows remarkable performance across different coverages and human populations.
Low-coverage whole-genome sequencing followed by imputation has been proposed as a cost-effective genotyping approach for disease and population genetics studies. However, its competitiveness against SNP arrays is undermined because current imputation methods are computationally expensive and unable to leverage large reference panels. Here, we describe a method, GLIMPSE, for phasing and imputation of low-coverage sequencing datasets from modern reference panels. We demonstrate its remarkable performance across different coverages and human populations. GLIMPSE achieves imputation of a genome for less than US$1 in computational cost, considerably outperforming other methods and improving imputation accuracy over the full allele frequency range. As a proof of concept, we show that 1x coverage enables effective gene expression association studies and outperforms dense SNP arrays in rare variant burden tests. Overall, this study illustrates the promising potential of low-coverage imputation and suggests a paradigm shift in the design of future genomic studies.
Pages: 120 - 126
Number of pages: 22