metabolomics;
NMR;
LASSO;
zero-sum;
normalization;
scaling;
ACUTE KIDNEY INJURY;
DATA NORMALIZATION;
CARDIAC-SURGERY;
DATA SETS;
NMR;
METABOLOMICS;
REGRESSION;
SELECTION;
DISEASE;
MODEL;
DOI:
10.1021/acs.jproteome.7b00325
Chinese Library Classification:
Q5 [Biochemistry];
Discipline classification codes:
071010;
081704;
Abstract:
Metabolomics data is typically scaled to a common reference like a constant volume of body fluid, a constant creatinine level, or a constant area under the spectrum. Such scaling of the data, however, may affect the selection of biomarkers and the biological interpretation of results in unforeseen ways. Here, we studied how both the outcome of hypothesis tests for differential metabolite concentration and the screening for multivariate metabolite signatures are affected by the choice of scale. To overcome this problem for metabolite signatures and to establish a scale-invariant biomarker discovery algorithm, we extended linear zero-sum regression to the logistic regression framework and showed in two applications to ¹H NMR-based metabolomics data how this approach overcomes the scaling problem. Logistic zero-sum regression is available as an R package as well as a high-performance computing implementation that can be downloaded at https://github.com/rehbergT/zeroSum.
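The scale invariance described in the abstract can be illustrated with a minimal sketch. It assumes, as is usual for zero-sum regression but not stated in this record, that the model is fitted on log-transformed concentrations, so rescaling a sample by a factor gamma adds log(gamma) to every feature. The coefficient values, the factor `gamma`, and all function names below are hypothetical illustrations and are not taken from the zeroSum package.

```python
import math
import random

def logistic(z):
    """Standard logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(concentrations, beta, intercept=0.0):
    """Class probability from log-transformed concentrations (assumed preprocessing)."""
    log_x = [math.log(c) for c in concentrations]
    z = intercept + sum(b * x for b, x in zip(beta, log_x))
    return logistic(z)

random.seed(0)
# Hypothetical metabolite concentrations for one sample.
concentrations = [random.uniform(0.5, 5.0) for _ in range(4)]

# Hypothetical coefficient vectors: one obeys the zero-sum constraint, one does not.
beta_zero_sum = [0.8, -0.3, -0.6, 0.1]   # coefficients sum to 0
beta_free = [0.8, -0.3, -0.6, 0.4]       # coefficients sum to 0.3

# Rescaling the sample (e.g. a different reference) multiplies every feature by gamma.
gamma = 3.7
rescaled = [gamma * c for c in concentrations]

p_zs_raw = predict(concentrations, beta_zero_sum)
p_zs_scaled = predict(rescaled, beta_zero_sum)
p_free_raw = predict(concentrations, beta_free)
p_free_scaled = predict(rescaled, beta_free)

# Under the zero-sum constraint the shift log(gamma) * sum(beta) vanishes.
print("zero-sum difference:", abs(p_zs_raw - p_zs_scaled))
print("unconstrained difference:", abs(p_free_raw - p_free_scaled))
```

With coefficients that sum to zero, the predicted probability is unchanged by the rescaling up to floating-point error, whereas the unconstrained coefficients give a different prediction for the same sample.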
Affiliation:
Tampere Univ Technol, Tampere Int Ctr Signal Proc, FIN-33101 Tampere, Finland