Recovering Escherichia coli Plasmids in the Absence of Long-Read Sequencing Data

Cited by: 12
Authors
Paganini, Julian A. [1 ]
Plantinga, Nienke L. [1 ]
Arredondo-Alonso, Sergio [2 ,3 ]
Willems, Rob J. L. [1 ]
Schurch, Anita C. [1 ]
Affiliations
[1] Univ Med Ctr Utrecht, Dept Med Microbiol, NL-3584 CX Utrecht, Netherlands
[2] Univ Oslo, Fac Med, Dept Biostat, N-0372 Oslo, Norway
[3] Wellcome Sanger Inst, Parasites & Microbes, Cambridge CB10 1SA, England
Funding
EU Horizon 2020
Keywords
WGS; plasmids; antibiotic resistance; bioinformatics; Escherichia coli; resistance genes; epidemiology
DOI
10.3390/microorganisms9081613
Chinese Library Classification (CLC) number
Q93 [Microbiology]
Discipline classification code
071005; 100705
Abstract
The incidence of infections caused by multidrug-resistant E. coli strains has risen in recent years. Antibiotic resistance in E. coli is often mediated by acquisition and maintenance of plasmids. The study of E. coli plasmid epidemiology and genomics often requires long-read sequencing information, but recently a number of tools that allow plasmid prediction from short-read data have been developed. Here, we reviewed 25 available plasmid prediction tools and categorized them into binary plasmid/chromosome classification tools and plasmid reconstruction tools. We benchmarked six tools (MOB-suite, plasmidSPAdes, gplas, FishingForPlasmids, HyAsP and SCAPP) that aim to reliably reconstruct distinct plasmids, with a special focus on plasmids carrying antibiotic resistance genes (ARGs) such as extended-spectrum beta-lactamase genes. We found that two thirds (n = 425, 66.3%) of all plasmids were correctly reconstructed by at least one of the six tools, with a range of 92 (14.58%) to 317 (50.23%) correctly predicted plasmids. However, the majority of plasmids that carried antibiotic resistance genes (n = 85, 57.8%) could not be completely recovered as distinct plasmids by any of the tools. MOB-suite was the only tool that was able to correctly reconstruct the majority of plasmids (n = 317, 50.23%), and performed best at reconstructing large plasmids (n = 166, 46.37%) and ARG-plasmids (n = 41, 27.9%), but predictions frequently contained chromosome contamination (40%). In contrast, plasmidSPAdes reconstructed the highest fraction of plasmids smaller than 18 kbp (n = 168, 61.54%). Large ARG-plasmids, however, were frequently merged with sequences derived from distinct replicons. Available bioinformatic tools can provide valuable insight into E. coli plasmids, but also have important limitations. This work will serve as a guideline for selecting the most appropriate plasmid reconstruction tool for studies focusing on E. coli plasmids in the absence of long-read sequencing data.
Pages: 20