Deep learning for peptide identification from metaproteomics datasets

Cited by: 9
Authors
Feng, Shichao [1]
Sterzenbach, Ryan [2]
Guo, Xuan [1]
Affiliations
[1] Univ North Texas, Dept Comp Sci & Engn, 3940 N Elm St,Ste F290, Denton, TX 76207 USA
[2] Univ North Texas, Dept Biomed Engn, Denton, TX 76203 USA
Funding
National Institutes of Health (NIH), USA
Keywords
Peptide identification; Deep learning; Tandem mass spectrometry; CNN; Protein identification; Statistical model; MS/MS; Confidence; Challenges; Reveals
DOI
10.1016/j.jprot.2021.104316
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Metaproteomics is becoming widely used in microbiome research to gain insight into the functional state of microbial communities. Current metaproteomics studies are generally based on high-throughput tandem mass spectrometry (MS/MS) coupled with liquid chromatography. In this paper, we propose a deep-learning-based algorithm, named DeepFilter, for improving peptide identification from a collection of tandem mass spectra. The key advantage of DeepFilter is that, unlike existing filtering tools, it does not need ad hoc training or fine-tuning. DeepFilter is freely available under the GNU GPL license at https://github.com/Biocomputing-Research-Group/DeepFilter.
Significance: The identification of peptides and proteins from MS data involves searching MS/MS spectra against a predefined protein sequence database and assigning the top-scoring peptides to spectra. Existing computational tools are still far from being able to extract all the information from MS/MS data sets acquired from metaproteome samples. Systematic experiments demonstrate that DeepFilter identified up to 12% and 9% more peptide-spectrum matches and proteins, respectively, than existing filtering algorithms, including Percolator, Q-ranker, PeptideProphet, and iProphet, on marine and soil microbial metaproteome samples at a false discovery rate of 1%. The taxonomic analysis shows that DeepFilter found up to 7%, 10%, and 14% more species from marine, soil, and human gut samples than existing filtering algorithms. Therefore, DeepFilter is believed to generalize well to new, previously unseen peptide-spectrum matches and can be readily applied to peptide identification from metaproteomics data.
Pages: 14
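The abstract reports results at a 1% false discovery rate (FDR) but does not say how that threshold is applied. As a rough illustration only, the sketch below uses the widely used target-decoy estimate (decoys divided by targets among the accepted matches), which is an assumption here and not necessarily the procedure used by DeepFilter. The function name `filter_psms_at_fdr` and the `(score, is_decoy)` input layout are hypothetical.

```python
from typing import List, Tuple

# A PSM is represented here as (score, is_decoy); higher score = better match.
# This layout is an assumption made for illustration.
PSM = Tuple[float, bool]


def filter_psms_at_fdr(psms: List[PSM], fdr_threshold: float = 0.01) -> List[PSM]:
    """Return the target PSMs accepted at the given FDR.

    FDR among the top-i ranked PSMs is estimated as decoys / targets
    (a simple target-decoy estimate; other estimators exist).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = 0  # number of top-ranked PSMs to keep
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the deepest rank at which the estimated FDR is still acceptable.
        if targets > 0 and decoys / targets <= fdr_threshold:
            best_cutoff = i
    # Report only target matches; decoys exist solely to estimate the error rate.
    return [p for p in ranked[:best_cutoff] if not p[1]]


if __name__ == "__main__":
    example = (
        [(float(s), False) for s in range(101, 301)]  # 200 high-scoring targets
        + [(100.0, True)]                             # first decoy
        + [(float(s), False) for s in range(50, 100)] # 50 more targets
        + [(49.0, True), (48.0, True), (47.0, True)]  # low-scoring decoys
    )
    accepted = filter_psms_at_fdr(example, fdr_threshold=0.01)
    print(len(accepted), "target PSMs accepted")
```

A rescoring tool of the kind described in the abstract would change the `score` values before this step; a better scoring function pushes more true matches above the decoys, so more PSMs pass the same threshold.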