Predicting essential genes in prokaryotic genomes using a linear method: ZUPLS

Cited by: 20
Authors
Song, Kai [1]
Tong, Tuopong [1]
Wu, Fang [1]
Affiliations
[1] Tianjin Univ, Sch Chem Engn & Technol, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
UNINFORMATIVE VARIABLE ELIMINATION; SHORT CODING SEQUENCES; MYCOBACTERIUM-TUBERCULOSIS; GRAM STAIN; RECOGNITION; SELECTION; PATTERN; BIOLOGY; BIAS;
DOI
10.1039/c3ib40241j
Chinese Library Classification number
Q2 [Cell Biology]
Discipline classification codes
071009; 090102
Abstract
An effective linear method, ZUPLS, was developed to improve the accuracy and speed of prokaryotic essential gene identification. ZUPLS uses only the Z-curve and other sequence-based features, which can be calculated readily from DNA and amino acid sequences. Therefore, no well-studied biological network knowledge is required to use ZUPLS, which significantly simplifies essential gene identification, especially for newly sequenced species. ZUPLS also selects the necessary features automatically by embedding the uninformative variable elimination tool into the partial least squares classifier. No optimized modelling parameters are needed. To test its performance, ZUPLS was used to predict the essential genes of 12 remotely related prokaryotes. Cross-organism predictions yielded AUC (area under the curve) scores between 0.8042 and 0.9319 with E. coli genes as the training samples, and between 0.8111 and 0.9371 with B. subtilis genes as the training samples. ZUPLS was also compared with the best available results of existing approaches. The AUC score improved by 0.13 in predicting B. subtilis essential genes using E. coli genes, and by 0.10 in predicting E. coli essential genes using P. aeruginosa genes. Similarly, the average accuracy for M. pulmonis using M. genitalium and M. pulmonis genes improved by 14.7%. The combined feature extraction and selection power of ZUPLS enables it to give reliable predictions of essential genes for both Gram-positive and Gram-negative organisms and for both rich and poor culture media.
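To illustrate how Z-curve features can be computed directly from a DNA sequence, the following is a minimal Python sketch. It uses the standard Z-curve definition (purine vs. pyrimidine, amino vs. keto, weak vs. strong hydrogen bonding), normalized by sequence length. The function names `z_curve_features` and `phase_specific_features` are hypothetical, and splitting by codon position is an assumption made for illustration; the abstract does not specify the exact feature set used by ZUPLS.

```python
from collections import Counter


def z_curve_features(seq):
    """Return length-normalized Z-curve components (x, y, z) of a DNA sequence."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    # Base frequencies; ambiguous symbols such as N are ignored.
    a, c, g, t = (counts[b] / total for b in "ACGT")
    x = (a + g) - (c + t)  # purine vs. pyrimidine
    y = (a + c) - (g + t)  # amino vs. keto
    z = (a + t) - (g + c)  # weak vs. strong hydrogen bonding
    return x, y, z


def phase_specific_features(seq):
    """Concatenate Z-curve components computed at each codon position.

    Assumption for illustration only: the sequence is an in-frame coding
    sequence with at least one base at every codon position.
    """
    features = []
    for phase in range(3):
        features.extend(z_curve_features(seq[phase::3]))
    return features


if __name__ == "__main__":
    print(z_curve_features("ATGGCGTAA"))
    print(len(phase_specific_features("ATGGCGTAA")))
```

A vector such as the one returned by `phase_specific_features` could then serve as one row of the feature matrix passed to a linear classifier; the feature-selection and partial least squares steps are not shown here.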
Pages: 460-469
Page count: 10