Pan-genomic analysis of the species Salmonella enterica: Identification of core essential and putative essential genes

Cited by: 8
Authors
Chand, Yamini [1 ]
Alam, Md. Afroz [2 ]
Singh, Sachidanand [1 ]
Affiliations
[1] Shri Ramswaroop Mem Univ, Inst Biosci & Technol, Fac Biotechnol, Lucknow Deva Rd, Barabanki 225003, Uttar Pradesh, India
[2] Karunya Inst Technol & Sci, Dept Biotechnol, Coimbatore 641114, Tamil Nadu, India
Source
GENE REPORTS | 2020 / Volume 20
Keywords
Salmonella enterica; Comparative genomics; Phylogenetic analysis; BLAST matrix; Pan-genome; Core genome; Dispensable genome; COG classification; Essentiality analysis; DRUG TARGETS; ESCHERICHIA-COLI; DATABASE; DEG; PRIORITIZATION; ANNOTATION; PREDICTION; PROTEINS; SEQUENCE; REVEALS;
DOI
10.1016/j.genrep.2020.100669
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Background: Essential genes are defined as the minimal gene set required to support bacterial life. Identifying them is crucial for developing new antimicrobials against multidrug-resistant pathogens such as serovars of Salmonella enterica.
Methodology: In the present work, we hypothesize that essential genes within a group of evolutionarily closely related organisms may be highly conserved. We therefore conducted an extensive comparative genomic analysis of 44 genome sequences representing 17 serovars of S. enterica to gain an improved understanding of the conserved genes essential for its survival.
Results: Pan-genome estimates indicate that the genus Salmonella displays an open pan-genome structure comprising a reservoir of 10,775 gene families. Of these, 2,847, 4,657, and 3,271 constitute the core gene families (CGFs), dispensable gene families (DGFs), and strain-specific gene families (SSGFs), respectively. The pan-genome family tree based on the presence/absence of gene families is highly concordant with the 16S rRNA tree, though the former provides a more robust phylogenetic resolution. The Clusters of Orthologous Groups of proteins (COGs) database assigned the largest share of the CGFs (40.9%) to metabolism, whereas a large proportion of the DGFs (70.6%) was uncharacterized. Homology analysis of the CGFs against the Database of Essential Genes (DEG) identified 1,695 essential CGFs (E-CGFs). Of these, 687 are experimentally verified as essential in Salmonella, 1,157 are identified in >= 2 species, 159 are conserved in >= 7 species, and 538 are present in at least one species. Thus, for the species S. enterica, 69%, 52%, and 31% of the genome are dedicated to the core, essential, and dispensable functions, respectively.
Conclusion: The E-CGFs identified may serve as important targets for the development of novel antimicrobials, and their detailed analysis may shed new light on Salmonella's survival.
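The split of the pan-genome into CGFs, DGFs, and SSGFs can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the commonly used definitions, which the abstract does not spell out: a family is core if present in every genome, strain-specific if present in exactly one, and dispensable otherwise. The function name `classify_gene_families` and the toy genomes `g1` to `g3` are hypothetical. The last lines only check that the three counts reported in the abstract sum to the reported pan-genome size.

```python
from collections import Counter


def classify_gene_families(presence):
    """Label each gene family from a presence/absence mapping.

    presence: dict mapping a gene-family name to the set of genomes
    in which that family occurs (hypothetical input format).
    Assumed definitions: core = present in all genomes,
    strain-specific = present in exactly one, dispensable = the rest.
    """
    genomes = set().union(*presence.values())
    n_genomes = len(genomes)
    labels = {}
    for family, members in presence.items():
        if len(members) == n_genomes:
            labels[family] = "core"
        elif len(members) == 1:
            labels[family] = "strain-specific"
        else:
            labels[family] = "dispensable"
    return labels


# Toy data, invented purely for illustration.
toy = {
    "famA": {"g1", "g2", "g3"},
    "famB": {"g1", "g2"},
    "famC": {"g3"},
    "famD": {"g1", "g2", "g3"},
}
labels = classify_gene_families(toy)
counts = Counter(labels.values())

# Consistency check on the family counts reported in the abstract.
reported = {"core": 2847, "dispensable": 4657, "strain-specific": 3271}
pan_total = sum(reported.values())
```

With real data, the `presence` mapping would come from an ortholog-clustering step over the 44 genomes; stricter or looser thresholds for the core set would change the counts.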
Pages: 16