An effective linear method, ZUPLS, was developed to improve the accuracy and speed of prokaryotic essential gene identification. ZUPLS uses only the Z-curve and other sequence-based features, which can be calculated readily from DNA and amino acid sequences. Therefore, no knowledge of well-studied biological networks is required to use ZUPLS. This significantly simplifies essential gene identification, especially for newly sequenced species. ZUPLS also selects the necessary features automatically by embedding the uninformative variable elimination tool into the partial least squares classifier, so no optimized modelling parameters are needed. To test its performance, ZUPLS was used here to predict the essential genes of 12 remotely related prokaryotes. Cross-organism predictions yielded AUC (area under the curve) scores between 0.8042 and 0.9319 with E. coli genes as the training samples, and between 0.8111 and 0.9371 with B. subtilis genes as the training samples. We also compared ZUPLS with the best available results of existing approaches. The AUC score improved by 0.13 in predicting B. subtilis essential genes using E. coli genes, and by 0.10 in predicting E. coli essential genes using P. aeruginosa genes. Similarly, the average accuracy for M. pulmonis using M. genitalium and M. pulmonis genes improved by 14.7%. The combined feature extraction and selection power of ZUPLS enables it to give reliable predictions of essential genes for both Gram-positive and Gram-negative organisms and for both rich and poor culture media.
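To illustrate why Z-curve features can be computed directly from sequence, the following is a minimal sketch of the standard cumulative Z-curve transform of a DNA sequence. It is not the ZUPLS implementation: the abstract does not state which Z-curve variables ZUPLS uses, and the function name `z_curve` and the choice to skip ambiguous bases are assumptions made for this example.

```python
from typing import List, Tuple


def z_curve(seq: str) -> List[Tuple[int, int, int]]:
    """Return the cumulative Z-curve points (x_n, y_n, z_n) of a DNA sequence.

    For the first n bases, with cumulative counts A_n, C_n, G_n, T_n:
        x_n = (A_n + G_n) - (C_n + T_n)   # purine vs. pyrimidine
        y_n = (A_n + C_n) - (G_n + T_n)   # amino vs. keto
        z_n = (A_n + T_n) - (G_n + C_n)   # weak vs. strong hydrogen bonding
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points: List[Tuple[int, int, int]] = []
    for base in seq.upper():
        if base not in counts:
            # Assumption for this sketch: ambiguous symbols (e.g. N) are skipped.
            continue
        counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)))
    return points


if __name__ == "__main__":
    for point in z_curve("ATGC"):
        print(point)
```

Because each point depends only on running base counts, the transform runs in a single pass over the sequence and needs no annotation beyond the sequence itself, which is consistent with the claim that such features are readily available for newly sequenced species.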