Improving peak detection in high-resolution LC/MS metabolomics data using preexisting knowledge and machine learning approach

Cited by: 39
Authors
Yu, Tianwei [1]
Jones, Dean P. [2]
Affiliations
[1] Emory Univ, Rollins Sch Publ Hlth, Dept Biostat & Bioinformat, Atlanta, GA 30322 USA
[2] Emory Univ, Sch Med, Dept Med, Atlanta, GA 30322 USA
Keywords
SPECTROMETRY DATA; MASS; EXTRACTION; ALIGNMENT
DOI
10.1093/bioinformatics/btu430
CLC classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Peak detection is a key step in the preprocessing of untargeted metabolomics data generated from high-resolution liquid chromatography-mass spectrometry (LC/MS). The common practice is to use filters with predetermined parameters to select peaks in the LC/MS profile. This rigid approach can cause suboptimal performance when the chosen peak model and parameters do not suit the data characteristics.
Results: Here we present a method that learns directly from various data features of the extracted ion chromatograms (EICs) to differentiate true peak regions from noise regions in the LC/MS profile. It uses knowledge of known metabolites together with robust machine learning approaches. Unlike currently available methods, the new approach does not assume a parametric peak shape model, which allows maximum flexibility. We demonstrate the superiority of the new approach using real data. Because matching to known metabolites entails uncertainties and cannot be considered a gold standard, we also developed a probabilistic receiver-operating characteristic (pROC) approach that can incorporate these uncertainties.
Pages: 2941-2948
Page count: 8