An efficient variable selection method based on variable permutation and model population analysis for multivariate calibration of NIR spectra

Cited by: 29
Authors
Bin, Jun [1]
Ai, Fangfang [2]
Fan, Wei [1]
Zhou, Jiheng [1]
Li, Xin [1]
Tang, Wenxian [3]
Liang, Yizeng [3]
Affiliations
[1] Hunan Agr Univ, Coll Biosci & Biotechnol, Changsha, Hunan, Peoples R China
[2] Shanghai Tobacco Grp Co Ltd, Shanghai, Peoples R China
[3] Cent S Univ, Coll Chem & Chem Engn, Changsha, Hunan, Peoples R China
Keywords
Variable selection; Partial least squares; Variable permutation population analysis; Model population analysis; Exponentially decreasing function; Multivariate spectral calibration; WAVELENGTH INTERVAL SELECTION; LEAST-SQUARES REGRESSION; GENETIC ALGORITHMS; RANDOM FROG; SPECTROSCOPY; ELIMINATION; PLS;
DOI
10.1016/j.chemolab.2016.08.006
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Variable selection plays a pivotal role in the quantitative analysis of near-infrared (NIR) spectra, which have a large number of variables and relatively few samples. In this study, a novel algorithm, variable permutation population analysis (VPPA), which combines variable permutation, model population analysis (MPA) and an exponentially decreasing function (EDF), is proposed for variable selection to improve prediction performance in multivariate spectral calibration. The method first builds a large number of sub-datasets by Monte Carlo sampling (MCS) in both sample space and variable space; the importance of each variable is then evaluated from the ordering of the differences in the corresponding partial least squares (PLS) model prediction error before and after the variable is permuted. Next, the EDF is applied to forcibly eliminate the relatively uninformative variables. Finally, cross validation is used to choose the optimal variable subset. These four procedures together constitute a complete variable selection methodology. Three NIR datasets are used to illustrate the proposed method and evaluate its performance. With PLS as the modeling method, the results show that VPPA is a promising variable selection method that gives better prediction performance than conventional PLS, subwindow permutation analysis PLS (SPA-PLS), Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), competitive adaptive reweighted sampling PLS (CARS-PLS) and genetic algorithm PLS (GA-PLS). Moreover, the proposed approach employs fewer variables than the variable selection methods mentioned above. Therefore, the VPPA technique can be recommended for practical implementation in multivariate calibration of NIR spectra. © 2016 Elsevier B.V. All rights reserved.
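The four procedures named in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no formulas, so the CARS-style form of the EDF, the use of the mean error increase as the importance score (rather than the difference ordering the abstract mentions), the small NIPALS-style PLS1 routine, and all function names and parameter defaults below are assumptions made only for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """Fit a single-response PLS model (NIPALS-style PLS1)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xc.T @ yc
        norm = np.linalg.norm(w)
        if norm < 1e-10:                 # nothing left to explain
            break
        w = w / norm
        t = Xc @ w                       # scores
        tt = t @ t
        p = Xc.T @ t / tt                # X loadings
        c = yc @ t / tt                  # y loading
        Xc = Xc - np.outer(t, p)         # deflate X and y
        yc = yc - c * t
        W.append(w); P.append(p); q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P = np.array(W).T, np.array(P).T
    coef = W @ np.linalg.solve(P.T @ W, np.array(q))
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def permutation_scores(X, y, rng, n_models=50, var_ratio=0.5,
                       train_ratio=0.8, n_comp=2):
    """Steps 1-2: Monte Carlo sub-datasets, then error increase on permutation."""
    n, p = X.shape
    gain, count = np.zeros(p), np.zeros(p)
    k = min(p, max(n_comp, int(var_ratio * p)))   # variables per sub-dataset
    n_train = int(train_ratio * n)
    for _ in range(n_models):
        rows = rng.permutation(n)                 # sample-space sampling
        tr, te = rows[:n_train], rows[n_train:]
        cols = rng.choice(p, size=k, replace=False)  # variable-space sampling
        model = pls1_fit(X[np.ix_(tr, cols)], y[tr], n_comp)
        X_te = X[np.ix_(te, cols)]
        base = rmse(y[te], pls1_predict(model, X_te))
        for j, col in enumerate(cols):
            X_perm = X_te.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            gain[col] += rmse(y[te], pls1_predict(model, X_perm)) - base
            count[col] += 1
    return gain / np.maximum(count, 1)            # assumed score: mean increase

def edf_keep_counts(p, n_iter):
    """Step 3 (assumed CARS-style EDF): keep p variables at first, 2 at the end.
    Requires n_iter >= 2."""
    k = np.log(p / 2) / (n_iter - 1)
    a = (p / 2) ** (1 / (n_iter - 1))
    i = np.arange(1, n_iter + 1)
    return np.maximum(2, np.round(p * a * np.exp(-k * i)).astype(int))

def vppa_sketch(X, y, n_iter=10, n_comp=2, n_models=50, seed=0):
    """Shrink the active variable set along the EDF schedule."""
    rng = np.random.default_rng(seed)
    active, path = np.arange(X.shape[1]), []
    for keep in edf_keep_counts(X.shape[1], n_iter):
        scores = permutation_scores(X[:, active], y, rng,
                                    n_models=n_models, n_comp=n_comp)
        keep = min(keep, active.size)
        active = active[np.argsort(scores)[::-1][:keep]]
        path.append(active.copy())
    return path

def cv_rmse(X, y, n_comp=2, folds=5):
    idx, sq_err = np.arange(len(y)), []
    for f in range(folds):
        te = idx[f::folds]
        tr = np.setdiff1d(idx, te)
        model = pls1_fit(X[tr], y[tr], n_comp)
        sq_err.append((y[te] - pls1_predict(model, X[te])) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_err))))

def select_by_cv(X, y, path, n_comp=2):
    """Step 4: pick the subset on the path with the lowest cross-validated RMSE."""
    errors = [cv_rmse(X[:, s], y, n_comp) for s in path]
    return path[int(np.argmin(errors))]
```

Under these assumptions, `vppa_sketch(X, y)` returns the sequence of shrinking variable subsets and `select_by_cv(X, y, path)` picks one of them. In practice the number of PLS components, the sampling ratios, and the number of sub-models would need tuning per dataset.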
Pages: 1-13
Number of pages: 13