Application of High-Dimensional Feature Selection in Near-Infrared Spectroscopy of Cigarettes' Qualitative Evaluation

Cited by: 7
Authors
Qin Yuhua [1, 2]
Ding Xiangqian [2 ]
Gong Huili [2 ]
Affiliations
[1] Qingdao Univ Sci & Technol, Coll Informat Sci & Technol, Qingdao, Peoples R China
[2] China Ocean Univ, Coll Informat Sci & Engn, Qingdao, Peoples R China
Keywords
cigarettes' NIR spectra; high-dimensional feature selection; principal component analysis (PCA); random forest feature importance measure (RFFIM); PREDICTION; PROJECTIONS;
DOI
10.1080/00387010.2012.746373
Chinese Library Classification number
O433 [Spectroscopy];
Discipline classification code
0703; 070302;
Abstract
To increase classification accuracy, this paper presents a new feature selection method, RFFIM-PCA, which combines the random forest feature importance measure (RFFIM) with principal component analysis (PCA) to analyze the near-infrared (NIR) spectra of tobacco. We applied the method to classifying cigarettes by qualitative evaluation and compared it with other methods. The results showed that RFFIM-PCA discriminates high-dimensional data effectively and can be used to identify cigarette quality. The feature selection step filters out noise, while PCA eliminates redundant features and reduces dimensionality. Together, these steps removed noise and redundant features from the high-dimensional data, leading to a promising improvement in feature selection and classification accuracy.
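The two-stage idea in the abstract (importance-based selection followed by PCA) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives no parameters, so the function name `rffim_pca`, the values `n_keep=50` and `n_components=5`, the use of scikit-learn's impurity-based `feature_importances_` as the importance measure, and the synthetic data standing in for NIR spectra are all hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier


def rffim_pca(X, y, n_keep=50, n_components=5, random_state=0):
    """Keep the n_keep features ranked highest by a random forest, then apply PCA.

    Assumption: impurity-based importance stands in for the paper's RFFIM.
    """
    forest = RandomForestClassifier(n_estimators=200, random_state=random_state)
    forest.fit(X, y)
    # Column indices sorted by descending importance, truncated to n_keep.
    keep = np.argsort(forest.feature_importances_)[::-1][:n_keep]
    # PCA on the retained columns removes redundancy among correlated features.
    pca = PCA(n_components=n_components, random_state=random_state)
    scores = pca.fit_transform(X[:, keep])
    return keep, pca, scores


# Synthetic stand-in for spectra: 120 samples, 300 "wavelengths", 3 classes.
X, y = make_classification(
    n_samples=120,
    n_features=300,
    n_informative=10,
    n_redundant=20,
    n_classes=3,
    random_state=0,
)
keep, pca, scores = rffim_pca(X, y)
print(scores.shape)  # (120, 5)
```

The returned `scores` would then be passed to a classifier. In a real evaluation, the forest and the PCA should be fitted on training data only and applied to held-out spectra, so that the selection step does not leak label information.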
Pages: 397-402
Page count: 6