Fast and accurate genomic analyses using genome graphs

Cited by: 128
Authors
Rakocevic, Goran [1, 2]
Semenyuk, Vladimir [1, 2]
Lee, Wan-Ping [1]
Spencer, James [1, 2]
Browning, John [1, 2]
Johnson, Ivan J. [1, 2]
Arsenijevic, Vladan [1, 2]
Nadj, Jelena [1, 2]
Ghose, Kaushik [1, 2]
Suciu, Maria C. [1, 2]
Ji, Sun-Gou [1, 2]
Demir, Gulfem [1, 2]
Li, Lizao [1, 2]
Toptas, Berke C. [1, 2]
Dolgoborodov, Alexey [1]
Pollex, Bjorn [1, 2]
Spulber, Iosif [1]
Glotova, Irina [1, 2]
Komar, Peter [1, 2]
Stachyra, Andrew L. [1, 2]
Li, Yilong [1, 2]
Popovic, Milos [1, 2]
Kallberg, Morten [1]
Jain, Amit [1, 2]
Kural, Deniz [1, 2]
Affiliations
[1] Seven Bridges Genomics Inc, Cambridge, MA 02129 USA
[2] Totient Inc, Cambridge, MA 02140 USA
Keywords
SHORT READ ALIGNMENT; DISCOVERY; SEQUENCE; PROVIDES; QUALITY; LOCI; MAP
DOI
10.1038/s41588-018-0316-4
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but currently only reflects a single consensus haplotype, thus impairing analysis accuracy. Here we present a graph reference genome implementation that enables read alignment across 2,800 diploid genomes encompassing 12.6 million SNPs and 4.0 million insertions and deletions (indels). The pipeline processes one whole-genome sequencing sample in 6.5 h using a system with 36 CPU cores. We show that using a graph genome reference improves read mapping sensitivity and produces a 0.5% increase in variant calling recall, with unaffected specificity. Structural variations incorporated into a graph genome can be genotyped accurately under a unified framework. Finally, we show that iterative augmentation of graph genomes yields incremental gains in variant calling accuracy. Our implementation is an important advance toward fulfilling the promise of graph genomes to radically enhance the scalability and accuracy of genomic analyses.
Pages: 354 / +
Number of pages: 11
Published in: Nature Genetics, 2019, 51 : 354 - 362