ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data

Cited by: 10475
Authors
Wang, Kai [1]
Li, Mingyao [2]
Hakonarson, Hakon [1, 3]
Affiliations
[1] Childrens Hosp Philadelphia, Ctr Appl Genom, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Pediat, Philadelphia, PA 19104 USA
Keywords
SNPS; ASSOCIATION; GENOMES;
DOI
10.1093/nar/gkq603
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ~4 min to perform gene-based annotation and ~15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.
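The idea of a stepwise 'variants reduction' can be illustrated with a minimal Python sketch. This is not ANNOVAR's code or command-line interface. The `Variant` record, its field names, the order of the filters, and the "at least two remaining variants per gene" rule for a recessive model are all assumptions made for illustration, loosely based on the annotation types the abstract mentions (gene-based consequence, conserved regions, 1000 Genomes Project, dbSNP). The gene names are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical, pre-annotated variant record (not an ANNOVAR data type)."""
    gene: str
    region: str         # e.g. "exonic", "splicing", "intronic", "intergenic"
    in_conserved: bool  # overlaps a conserved region
    in_1000g: bool      # reported in the 1000 Genomes Project
    in_dbsnp: bool      # reported in dbSNP


def reduce_variants(variants, min_hits_per_gene=2):
    """Apply assumed filters one step at a time and return candidate genes."""
    # Step 1 (assumed): keep variants likely to affect a protein product.
    kept = [v for v in variants if v.region in {"exonic", "splicing"}]
    # Step 2 (assumed): keep variants that fall in conserved regions.
    kept = [v for v in kept if v.in_conserved]
    # Steps 3 and 4 (assumed): drop variants already catalogued in public
    # databases, since a rare-disease mutation is unlikely to be common.
    kept = [v for v in kept if not v.in_1000g]
    kept = [v for v in kept if not v.in_dbsnp]
    # Step 5 (assumed recessive model): require several remaining variants
    # in the same gene.
    by_gene = defaultdict(list)
    for v in kept:
        by_gene[v.gene].append(v)
    return sorted(g for g, vs in by_gene.items() if len(vs) >= min_hits_per_gene)


# Toy input with placeholder gene names.
toy = [
    Variant("GENE_A", "exonic", True, False, False),
    Variant("GENE_A", "splicing", True, False, False),
    Variant("GENE_B", "exonic", True, False, True),    # dropped: in dbSNP
    Variant("GENE_B", "exonic", True, False, False),   # only one variant left
    Variant("GENE_C", "intronic", True, False, False),  # dropped: not exonic
    Variant("GENE_C", "intronic", True, False, False),
]
candidates = reduce_variants(toy)
print(candidates)
```

Each step only narrows the list produced by the previous one, so the filters can be reordered or swapped without changing the overall structure of the procedure.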
Pages: 7