Improving peak detection in high-resolution LC/MS metabolomics data using preexisting knowledge and machine learning approach
Cited by: 39
Authors:
Yu, Tianwei [1]
Jones, Dean P. [2]
Affiliations:
[1] Emory Univ, Rollins Sch Publ Hlth, Dept Biostat & Bioinformat, Atlanta, GA 30322 USA
[2] Emory Univ, Sch Med, Dept Med, Atlanta, GA 30322 USA
Keywords:
SPECTROMETRY DATA;
MASS;
EXTRACTION;
ALIGNMENT;
DOI:
10.1093/bioinformatics/btu430
Chinese Library Classification (CLC) number:
Q5 [Biochemistry];
Discipline classification codes:
071010 ;
081704 ;
Abstract:
Motivation: Peak detection is a key step in the preprocessing of untargeted metabolomics data generated from high-resolution liquid chromatography-mass spectrometry (LC/MS). The common practice is to use filters with predetermined parameters to select peaks in the LC/MS profile. This rigid approach can cause suboptimal performance when the choice of peak model and parameters does not suit the data characteristics. Results: Here we present a method that learns directly from various data features of the extracted ion chromatograms (EICs) to differentiate true peak regions from noise regions in the LC/MS profile. It utilizes the knowledge of known metabolites, as well as robust machine learning approaches. Unlike currently available methods, this new approach does not assume a parametric peak shape model and allows maximum flexibility. We demonstrate the superiority of the new approach using real data. Because matching to known metabolites entails uncertainties and cannot be considered a gold standard, we also developed a probabilistic receiver-operating characteristic (pROC) approach that can incorporate uncertainties.
Pages: 2941-2948
Number of pages: 8