Analysis of metabolomic data using support vector machines

Cited by: 286
Authors
Mahadevan, Sankar [2 ]
Shah, Sirish L. [2 ]
Marrie, Thomas J. [1 ]
Slupsky, Carolyn M. [1 ]
Affiliations
[1] Univ Alberta, Dept Med, Edmonton, AB, Canada
[2] Univ Alberta, Dept Chem & Mat Engn, Edmonton, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
DOI
10.1021/ac800954c
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Metabolomics is an emerging field that provides insight into physiological processes. It is an effective tool for investigating disease diagnosis or conducting toxicological studies by observing changes in metabolite concentrations in various biofluids. Multivariate statistical analysis is generally applied to nuclear magnetic resonance (NMR) or mass spectrometry (MS) data to determine differences between groups (for instance, diseased vs healthy). Characteristic predictive models may be built from a set of training data, and these models are subsequently used to predict whether new test data fall under a specific class. In this study, metabolomic data are obtained by 1H NMR spectroscopy of urine samples from healthy subjects (male and female) and from patients with pneumonia caused by Streptococcus pneumoniae. We compare the performance of traditional partial least-squares discriminant analysis (PLS-DA) with that of support vector machines (SVMs), a technique widely used in genome studies, on two case studies: (1) a case where nearly complete distinction may be seen (healthy versus pneumonia) and (2) a case where the distinction is more ambiguous (male versus female). We show that SVMs are superior to PLS-DA in both cases in terms of predictive accuracy with the least number of features. With fewer features, SVMs give a better predictive model than PLS-DA.
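The train-then-predict workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' code, data, or SVM formulation: it assumes synthetic "metabolite concentration" vectors in which only two of five features differ between classes, and it trains a plain linear SVM by sub-gradient descent on the hinge loss. All names (make_samples, train_linear_svm, accuracy) and all parameter values are hypothetical, and the PLS-DA comparison is omitted.

```python
import random


def make_samples(n, shift=2.0, n_features=5, seed=0):
    """Synthetic two-class data (assumption, not the paper's NMR data).

    Labels alternate between +1 and -1. Only features 0 and 1 carry
    class information; the remaining features are pure noise.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        x[0] += shift * label
        x[1] -= shift * label
        data.append((x, label))
    return data


def train_linear_svm(data, lam=0.01, lr=0.01, epochs=100, seed=0):
    """Linear SVM via stochastic sub-gradient descent on the
    L2-regularized hinge loss. Returns (weights, bias)."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Regularization step: shrink the weights toward zero.
            w = [wj - lr * lam * wj for wj in w]
            # Hinge-loss step: only samples inside the margin contribute.
            if margin < 1:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def accuracy(data, w, b):
    """Fraction of samples whose predicted sign matches the label."""
    correct = 0
    for x, y in data:
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        predicted = 1 if score >= 0 else -1
        correct += predicted == y
    return correct / len(data)


# Build the model on training data, then predict classes for unseen test data.
train = make_samples(100, seed=0)
test = make_samples(100, seed=1)
w, b = train_linear_svm(train)
test_acc = accuracy(test, w, b)
print("test accuracy:", test_acc)
print("weights:", [round(wj, 3) for wj in w])
```

In this sketch the learned weights on the noise features should end up small compared with those on the two informative features, which hints at why a margin-based classifier can do well with few features. The actual feature-selection procedure used in the paper is not described in the abstract and is not reproduced here.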
Pages: 7562-7570
Page count: 9
Related papers
50 in total
  • [31] Aerial LiDAR data classification using Support Vector Machines (SVM)
    Lodha, Suresh K.
    Kreps, Edward J.
    Helmbold, David P.
    Fitzpatrick, Darren
    THIRD INTERNATIONAL SYMPOSIUM ON 3D DATA PROCESSING, VISUALIZATION, AND TRANSMISSION, PROCEEDINGS, 2007, : 567 - 574
  • [32] Anomaly Detection Using Support Vector Machines for Time Series Data
    Yokkampon, Umaporn
    Chumkamon, Sakmongkon
    Mowshowitz, Abbe
    Fujisawa, Ryusuke
    Hayashi, Eiji
    JOURNAL OF ROBOTICS NETWORKING AND ARTIFICIAL LIFE, 2021, 8 (01): : 41 - 46
  • [33] Using the Leader Algorithm with Support Vector Machines for Large Data Sets
    Romero, Enrique
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2011, PT I, 2011, 6791 : 225 - 232
  • [34] Data-Driven Fault Classification Using Support Vector Machines
    Jallepalli, Deepthi
    Kakhki, Fatemeh Davoudi
    INTELLIGENT HUMAN SYSTEMS INTEGRATION 2021, 2021, 1322 : 316 - 322
  • [35] Support Vector Machines Training Data Selection Using a Genetic Algorithm
    Kawulok, Michal
    Nalepa, Jakub
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, 2012, 7626 : 557 - 565
  • [36] Fault Detection and Diagnosis in Process Data Using Support Vector Machines
    Wu, Fang
    Yin, Shen
    Karimi, Hamid Reza
    JOURNAL OF APPLIED MATHEMATICS, 2014,
  • [37] Biological Data Classification Using Rough Sets and Support Vector Machines
    Zhao, Yanjun
    Zhang, Yanqing
    Xiong, Naixue
    2009 ANNUAL MEETING OF THE NORTH AMERICAN FUZZY INFORMATION PROCESSING SOCIETY, 2009, : 344 - 349
  • [38] A note on classification of gene expression data using support vector machines
    Fujarewicz, K
    Kimmel, M
    Rzeszowska-Wolny, J
    Swierniak, A
    JOURNAL OF BIOLOGICAL SYSTEMS, 2003, 11 (01) : 43 - 56
  • [39] Transcription factor discovery using support vector machines and heterogeneous data
    Barbe, Jose F.
    Tewfik, Ahmed H.
    Khodursky, Arkady B.
    2007 IEEE INTERNATIONAL WORKSHOP ON GENOMIC SIGNAL PROCESSING AND STATISTICS, 2007, : 27 - +
  • [40] Sedation Evaluation Based on Support Vector Machines Using IMU Data
    Ye, Jianping
    Wang, Tao
    Huang, Xiaoxia
    Gu, Yu
    Wang, Zhikang
    Chu, Yonghua
    Huang, Tianhai
    Liu, Tao
    IEEE SENSORS JOURNAL, 2024, 24 (11) : 17478 - 17485