Using variable combination population analysis for variable selection in multivariate calibration

Times cited: 188
Authors
Yun, Yong-Huan [1 ]
Wang, Wei-Ting [1 ]
Deng, Bai-Chuan [1 ,5 ]
Lai, Guang-Bi [2 ]
Liu, Xin-bo [1 ]
Ren, Da-Bing [1 ]
Liang, Yi-Zeng [1 ]
Fan, Wei [3 ]
Xu, Qing-Song [4 ]
Affiliations
[1] Cent South Univ, Coll Chem & Chem Engn, Changsha 410083, Hunan, Peoples R China
[2] Heilongjiang Univ Chinese Med, Heilongjiang 150040, Haerbin, Peoples R China
[3] Hunan Agr Univ, Coll Biosci & Biotechnol, Joint Lab Biol Qual & Safety, Changsha 410128, Hunan, Peoples R China
[4] Cent South Univ, Sch Math & Stat, Changsha 410083, Hunan, Peoples R China
[5] Univ Bergen, Dept Chem, N-5007 Bergen, Norway
Keywords
Partial least squares; Variable combination; Variable selection; Model population analysis; Exponentially decreasing function; Multivariate calibration; GENETIC ALGORITHM-PLS; WAVELENGTH SELECTION; SPECTRAL DATA; RANDOM FROG; REGRESSION; ELIMINATION; OPTIMIZATION; PERSPECTIVE; TOOL;
DOI
10.1016/j.aca.2014.12.048
Chinese Library Classification number
O65 [Analytical Chemistry];
Subject classification codes
070302 ; 081704 ;
Abstract
Variable (wavelength or feature) selection has become a critical step in the analysis of datasets with a high number of variables and relatively few samples. In this study, a novel variable selection strategy, variable combination population analysis (VCPA), is proposed. The strategy consists of two crucial procedures. First, the exponentially decreasing function (EDF), which embodies the simple and effective principle of 'survival of the fittest' from Darwin's theory of natural evolution, is employed to determine the number of variables to keep and to shrink the variable space continuously. Second, in each EDF run, a binary matrix sampling (BMS) strategy, which gives each variable the same chance of being selected and generates different variable combinations, is used to produce a population of subsets from which a population of sub-models is constructed. Model population analysis (MPA) is then employed to find the variable subsets with the lowest root mean square error of cross validation (RMSECV). The frequency with which each variable appears in the best 10% of sub-models is computed; the higher the frequency, the more important the variable. The performance of the proposed procedure was investigated using three real NIR datasets. The results indicate that VCPA is a good variable selection strategy when compared with four high-performing variable selection methods: genetic algorithm-partial least squares (GA-PLS), Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), competitive adaptive reweighted sampling (CARS) and iteratively retaining informative variables (IRIV). The MATLAB source code of VCPA is available for academic research on the website: http://www.mathworks.com/matlabcentral/fileexchange/authors/498750. (C) 2015 Elsevier B.V. All rights reserved.
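The procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code. The abstract gives no formulas, so several details are assumptions: the EDF is taken to have the form p * exp(-theta * i), the BMS matrix is built so that every column holds the same number of ones, and the function names, default parameters and the `toy_error` scoring function (a stand-in for the RMSECV of a PLS sub-model) are all hypothetical.

```python
import math
import random

def edf_keep_counts(n_vars, n_final, n_runs):
    """Variables kept after each EDF run (assumed form: n_vars * exp(-theta * i))."""
    theta = math.log(n_vars / n_final) / n_runs
    return [max(n_final, round(n_vars * math.exp(-theta * i)))
            for i in range(1, n_runs + 1)]

def binary_matrix_sampling(n_models, n_vars, ratio, rng):
    """Return n_models rows of 0/1 flags; every column has the same number of ones."""
    n_ones = max(1, round(ratio * n_models))
    columns = []
    for _ in range(n_vars):
        col = [1] * n_ones + [0] * (n_models - n_ones)
        rng.shuffle(col)  # same chance for each variable, different combinations
        columns.append(col)
    return [list(row) for row in zip(*columns)]  # one row per sub-model

def variable_frequencies(matrix, errors, top_fraction=0.1):
    """Count how often each variable appears in the lowest-error sub-models."""
    n_best = max(1, round(top_fraction * len(matrix)))
    best = sorted(range(len(matrix)), key=lambda k: errors[k])[:n_best]
    return [sum(matrix[k][j] for k in best) for j in range(len(matrix[0]))]

def vcpa_sketch(n_vars, error_fn, n_final=4, n_runs=10,
                n_models=200, ratio=0.5, seed=0):
    """Shrink the variable space run by run, keeping the most frequent variables."""
    rng = random.Random(seed)
    active = list(range(n_vars))
    for keep in edf_keep_counts(n_vars, n_final, n_runs):
        matrix = binary_matrix_sampling(n_models, len(active), ratio, rng)
        subsets = [[active[j] for j, flag in enumerate(row) if flag]
                   for row in matrix]
        errors = [error_fn(s) for s in subsets]
        freq = variable_frequencies(matrix, errors)
        ranked = sorted(range(len(active)), key=lambda j: -freq[j])
        active = sorted(active[j] for j in ranked[:keep])
    return active

# Hypothetical scoring function standing in for RMSECV: variables 2 and 7 are
# "informative"; missing one costs 1.0 and every included variable costs 0.01.
INFORMATIVE = {2, 7}

def toy_error(subset):
    return len(INFORMATIVE - set(subset)) + 0.01 * len(subset)

selected = vcpa_sketch(20, toy_error)
print(selected)
```

In a real application `error_fn` would fit a PLS model on the chosen columns and return its cross-validated error. The toy function only makes the sketch runnable without external libraries.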
Pages: 14-23
Page count: 10