Influenza sequence validation and annotation using VADR

Authors
Calhoun, Vincent C. [1]
Hatcher, Eneida L. [1]
Yankie, Linda [1]
Nawrocki, Eric P. [1]
Affiliations
[1] US National Library of Medicine, National Center for Biotechnology Information, 8600 Rockville Pike, Bethesda, MD 20894, USA
Source
Database: The Journal of Biological Databases and Curation | 2024 / Volume 2024
Funding
US National Institutes of Health
Keywords
A VIRUS; PROTEIN; GENERATION;
DOI
10.1093/database/baae091
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Tens of thousands of influenza sequences are deposited into the GenBank database each year. GenBank has used the FLu ANnotation tool (FLAN) since 2007 to validate and annotate incoming influenza sequence submissions; FLAN has been publicly available as a webserver but not as a standalone tool. Viral Annotation DefineR (VADR) is a general sequence validation and annotation software package that GenBank uses to process norovirus, dengue virus and SARS-CoV-2 sequences, and it is available as a standalone tool. We have created VADR influenza models based on the FLAN reference sequences and adapted VADR to accurately annotate influenza sequences. VADR and FLAN show consistent results on the vast majority of influenza sequences, and when they disagree, VADR is usually correct. VADR can also accurately process influenza D sequences as well as influenza A H17, H18, H19, N10 and N11 subtype sequences, which FLAN cannot. VADR 1.6.3 and the associated influenza models are now freely available for users to download and use.
Database URL: https://bitbucket.org/nawrockie/vadr-models-flu
Pages: 11