Classification of gasoline data obtained by gas chromatography using a piecewise alignment algorithm combined with feature selection and principal component analysis

Cited by: 138
Authors
Pierce, KM
Hope, JL
Johnson, KJ
Wright, BW
Synovec, RE
Affiliations
[1] Univ Washington, Dept Chem, Seattle, WA 98195 USA
[2] Pacific NW Natl Lab, Richland, WA 99352 USA
Keywords
alignment; gas chromatography; feature selection; principal component analysis; ANOVA; fuel; chemometrics
DOI
10.1016/j.chroma.2005.04.078
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
A fast and objective chemometric classification method is developed and applied to the analysis of gas chromatography (GC) data from five commercial gasoline samples. The gasoline samples serve as model mixtures, whereas the focus is on the development and demonstration of the classification method. The method is based on objective retention time alignment (referred to as piecewise alignment) coupled with analysis of variance (ANOVA) feature selection prior to classification by principal component analysis (PCA) using optimal parameters. The degree-of-class-separation is used as a metric to objectively optimize the alignment and feature selection parameters using a suitable training set thereby reducing user subjectivity, as well as to indicate the success of the PCA clustering and classification. The degree-of-class-separation is calculated using Euclidean distances between the PCA scores of a subset of the replicate runs from two of the five fuel types, i.e., the training set. The unaligned training set that was directly submitted to PCA had a low degree-of-class-separation (0.4), and the PCA scores plot for the raw training set combined with the raw test set failed to correctly cluster the five sample types. After submitting the training set to piecewise alignment, the degree-of-class-separation increased (1.2), but when the same alignment parameters were applied to the training set combined with the test set, the scores plot clustering still did not yield five distinct groups. Applying feature selection to the unaligned training set increased the degree-of-class-separation (4.8), but chemical variations were still obscured by retention time variation and when the same feature selection conditions were used for the training set combined with the test set, only one of the five fuels was clustered correctly. However, piecewise alignment coupled with feature selection yielded a reasonably optimal degree-of-class-separation for the training set (9.2), and when the same alignment and ANOVA parameters were applied to the training set combined with the test set, the PCA scores plot correctly classified the gasoline fingerprints into five distinct clusters. (c) 2005 Elsevier B.V. All rights reserved.
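The feature-selection and scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formula for the degree-of-class-separation, so the ratio used here (Euclidean distance between the two class centroids in PCA score space, divided by the combined within-class spread) is an assumption, as are the function names, the synthetic two-class "chromatograms", and the choice of keeping the 5 points with the highest F-ratio. Piecewise alignment is not included.

```python
import numpy as np


def anova_f_ratio(X, labels):
    """One-way ANOVA F-ratio for every column (retention-time point) of X."""
    classes = np.unique(labels)
    n_samples, n_classes = X.shape[0], classes.size
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[labels == c]
        class_mean = Xc.mean(axis=0)
        ss_between += Xc.shape[0] * (class_mean - grand_mean) ** 2
        ss_within += ((Xc - class_mean) ** 2).sum(axis=0)
    ms_between = ss_between / (n_classes - 1)
    ms_within = ss_within / (n_samples - n_classes)
    return ms_between / (ms_within + 1e-12)  # small constant avoids 0/0


def pca_scores(X, n_components=2):
    """PCA scores of mean-centered data, computed via SVD."""
    centered = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def class_separation(scores, labels):
    """ASSUMED metric: centroid distance over combined within-class spread."""
    a, b = np.unique(labels)  # expects exactly two classes
    A, B = scores[labels == a], scores[labels == b]
    centroid_distance = np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))
    spread_a = np.linalg.norm(A - A.mean(axis=0), axis=1).mean()
    spread_b = np.linalg.norm(B - B.mean(axis=0), axis=1).mean()
    return centroid_distance / (np.sqrt(spread_a**2 + spread_b**2) + 1e-12)


# Synthetic training set (hypothetical): 2 classes x 6 replicate runs,
# 500 data points per run, only 5 of which differ between the classes.
rng = np.random.default_rng(0)
labels = np.array([0] * 6 + [1] * 6)
X = rng.normal(0.0, 1.0, size=(12, 500))
X[labels == 1, 100:105] += 5.0

# PCA on all points versus PCA on the points with the highest F-ratio.
f_ratio = anova_f_ratio(X, labels)
selected = np.argsort(f_ratio)[-5:]
sep_raw = class_separation(pca_scores(X), labels)
sep_selected = class_separation(pca_scores(X[:, selected]), labels)
print(f"separation, all points:      {sep_raw:.2f}")
print(f"separation, selected points: {sep_selected:.2f}")
```

In a workflow like the one the abstract outlines, a metric of this kind would be recomputed while varying the alignment and feature-selection parameters on the training set, and the settings giving the largest value would then be applied unchanged to the training set combined with the test set.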
Pages: 101 - 110
Page count: 10
Related papers
50 in total
  • [1] Classification of high-speed gas chromatography-mass spectrometry data by principal component analysis coupled with piecewise alignment and feature selection
    Watson, Nathanial E.
    VanWingerden, Matthew M.
    Pierce, Karisa M.
    Wright, Bob W.
    Synovec, Robert E.
    JOURNAL OF CHROMATOGRAPHY A, 2006, 1129 (01) : 111 - 118
  • [2] Feature selection using principal component analysis and genetic algorithm
    Adhao, Rahul
    Pachghare, Vinod
    JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2020, 23 (02) : 595 - 602
  • [3] Feature Selection for Classification using Principal Component Analysis and Information Gain
    Omuya, Erick Odhiambo
    Okeyo, George Onyango
    Kimwele, Michael Waema
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 174
  • [4] Feature selection in principal component analysis of analytical data
    Guo, Q
    Wu, W
    Massart, DL
    Boucon, C
    de Jong, S
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2002, 61 (1-2) : 123 - 132
  • [5] Supervised feature selection using principal component analysis
    Rahmat, Fariq
    Zulkafli, Zed
    Ishak, Asnor Juraiza
    Rahman, Ribhan Zafira Abdul
    De Stercke, Simon
    Buytaert, Wouter
    Tahir, Wardah
    Ab Rahman, Jamalludin
    Ibrahim, Salwa
    Ismail, Muhamad
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (03) : 1955 - 1995
  • [6] Supervised feature selection using principal component analysis
    Fariq Rahmat
    Zed Zulkafli
    Asnor Juraiza Ishak
    Ribhan Zafira Abdul Rahman
    Simon De Stercke
    Wouter Buytaert
    Wardah Tahir
    Jamalludin Ab Rahman
    Salwa Ibrahim
    Muhamad Ismail
    Knowledge and Information Systems, 2024, 66 : 1955 - 1995
  • [7] Feature Selection of Weather Data with Interval Principal Component Analysis
    He, Chong-Cheng
    Jeng, Jin-Tsong
    2016 INTERNATIONAL CONFERENCE ON SYSTEM SCIENCE AND ENGINEERING (ICSSE), 2016,
  • [8] Feature Selection Algorithm for Motor Quality Types Using Weighted Principal Component Analysis
    Yeh, Yun-Chi
    Lin, Liuh-Chii
    Liu, Mei-Chen
    Chu, Tsui-Shiun
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT TECHNOLOGIES AND ENGINEERING SYSTEMS (ICITES2014), 2016, 345 : 151 - 157
  • [9] Study of the interdependency of the data sampling ratio with retention time alignment and principal component analysis for gas chromatography
    Nadeau, Jeremy S.
    Wilson, Ryan B.
    Hoggard, Jamin C.
    Wright, Bob W.
    Synovec, Robert E.
    JOURNAL OF CHROMATOGRAPHY A, 2011, 1218 (50) : 9091 - 9101
  • [10] Effective Data Reduction Using Discriminative Feature Selection Based on Principal Component Analysis
    Nwokoma, Faith
    Foreman, Justin
    Akujuobi, Cajetan M.
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2024, 6 (02) : 789 - 799