Annotation of metagenome short reads using proxygenes

Cited by: 24
Authors
Dalevi, Daniel [1]
Ivanova, Natalia N. [2]
Mavromatis, Konstantinos [2]
Hooper, Sean D. [2]
Szeto, Ernest [1]
Hugenholtz, Philip [3]
Kyrpides, Nikos C. [2]
Markowitz, Victor M. [1]
Affiliations
[1] Lawrence Berkeley Natl Lab, Biol Data Management & Technol Ctr, Berkeley, CA 94720 USA
[2] DOE Joint Genome Inst, Genome Biol Program, Walnut Creek, CA 94598 USA
[3] DOE Joint Genome Inst, Microbial Ecol Program, Walnut Creek, CA 94598 USA
DOI
10.1093/bioinformatics/btn276
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: A typical metagenome dataset generated using a 454 pyrosequencing platform consists of short reads sampled from the collective genome of a microbial community. The amount of sequence in such datasets is usually insufficient for assembly, and traditional gene prediction cannot be applied to unassembled short reads. As a result, analysis of such datasets usually involves comparisons in terms of relative abundances of various protein families. The latter requires assignment of individual reads to protein families, which is hindered by the fact that short reads contain only a fragment, usually small, of a protein.

Results: We have considered the assignment of pyrosequencing reads to protein families directly using RPS-BLAST against COG and Pfam databases and indirectly via proxygenes that are identified using BLASTx searches against protein sequence databases. Using simulated metagenome datasets as benchmarks, we show that the proxygene method is more accurate than the direct assignment. We introduce a clustering method which significantly reduces the size of a metagenome dataset while maintaining a faithful representation of its functional and taxonomic content.
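The indirect assignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how a proxygene is chosen among a read's BLASTx hits, so the sketch assumes the hit with the highest bit score is used. The read IDs, protein IDs, family labels, and the function names `pick_proxygenes` and `family_abundance` are all hypothetical.

```python
from collections import Counter

# Hypothetical BLASTx hits as (read_id, protein_id, bit_score) tuples.
hits = [
    ("read1", "protA", 55.0),
    ("read1", "protB", 80.0),
    ("read2", "protA", 42.0),
    ("read3", "protC", 61.0),
]

# Hypothetical mapping from full-length proteins to protein families.
protein_family = {"protA": "COG0001", "protB": "COG0002", "protC": "COG0001"}


def pick_proxygenes(hits, min_bits=0.0):
    """Map each read to its best-scoring protein hit (assumed proxygene rule)."""
    best = {}
    for read_id, protein_id, bits in hits:
        if bits < min_bits:
            continue  # ignore hits below the (assumed) score cutoff
        if read_id not in best or bits > best[read_id][1]:
            best[read_id] = (protein_id, bits)
    return {read_id: protein_id for read_id, (protein_id, _) in best.items()}


def family_abundance(proxies, protein_family):
    """Relative abundance of families, inherited by reads from their proxygenes."""
    counts = Counter(
        protein_family[p] for p in proxies.values() if p in protein_family
    )
    total = sum(counts.values())
    return {fam: n / total for fam, n in counts.items()} if total else {}


proxies = pick_proxygenes(hits)
abundance = family_abundance(proxies, protein_family)
```

The point of the sketch is that each read takes its family from a full-length database protein, not from a direct match of the short fragment against a family profile.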
Pages: I7 - I13
Page count: 7
Related papers
50 in total
  • [1] Clustering Metagenome Short Reads Using Weighted Proteins
    Folino, Gianluigi
    Gori, Fabio
    Jetten, Mike S. M.
    Marchiori, Elena
    EVOLUTIONARY COMPUTATION, MACHINE LEARNING AND DATA MINING IN BIOINFORMATICS, PROCEEDINGS, 2009, 5483 : 152 - +
  • [2] New approaches for metagenome assembly with short reads
    Ayling, Martin
    Clark, Matthew D.
    Leggett, Richard M.
    BRIEFINGS IN BIOINFORMATICS, 2020, 21 (02) : 584 - 594
  • [3] QuasR: quantification and annotation of short reads in R
    Gaidatzis, Dimos
    Lerch, Anita
    Hahne, Florian
    Stadler, Michael B.
    BIOINFORMATICS, 2015, 31 (07) : 1130 - 1132
  • [4] StrainXpress: strain aware metagenome assembly from short reads
    Kang, Xiongbin
    Luo, Xiao
    Schoenhuth, Alexander
    NUCLEIC ACIDS RESEARCH, 2022, 50 (17)
  • [5] MTR: taxonomic annotation of short metagenomic reads using clustering at multiple taxonomic ranks
    Gori, Fabio
    Folino, Gianluigi
    Jetten, Mike S. M.
    Marchiori, Elena
    BIOINFORMATICS, 2011, 27 (02) : 196 - 203
  • [6] Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads
    Carr, Rogan
    Borenstein, Elhanan
    PLOS ONE, 2014, 9 (08):
  • [7] Metagenome Annotation Using a Distributed Grid of Undergraduate Students
    Hingamp, Pascal
    Brochier, Celine
    Talla, Emmanuel
    Gautheret, Daniel
    Thieffry, Denis
    Herrmann, Carl
    PLOS BIOLOGY, 2008, 6 (11) : 2362 - 2367
  • [8] Combined assembly of long and short sequencing reads improve the efficiency of exploring the soil metagenome
    Xu, Guoshun
    Zhang, Liwen
    Liu, Xiaoqing
    Guan, Feifei
    Xu, Yuquan
    Yue, Haitao
    Huang, Jin-Qun
    Chen, Jieyin
    Wu, Ningfeng
    Tian, Jian
    BMC GENOMICS, 2022, 23 (01)
  • [9] Combined assembly of long and short sequencing reads improve the efficiency of exploring the soil metagenome
    Guoshun Xu
    Liwen Zhang
    Xiaoqing Liu
    Feifei Guan
    Yuquan Xu
    Haitao Yue
    Jin-Qun Huang
    Jieyin Chen
    Ningfeng Wu
    Jian Tian
    BMC Genomics, 23
  • [10] Evaluating techniques for metagenome annotation using simulated sequence data
    Randle-Boggis, Richard J.
    Helgason, Thorunn
    Sapp, Melanie
    Ashton, Peter D.
    FEMS MICROBIOLOGY ECOLOGY, 2016, 92 (07)