An efficient variable selection method based on variable permutation and model population analysis for multivariate calibration of NIR spectra

Cited by: 29
Authors
Bin, Jun [1]
Ai, Fangfang [2]
Fan, Wei [1]
Zhou, Jiheng [1]
Li, Xin [1]
Tang, Wenxian [3]
Liang, Yizeng [3]
Affiliations
[1] Hunan Agr Univ, Coll Biosci & Biotechnol, Changsha, Hunan, Peoples R China
[2] Shanghai Tobacco Grp Co Ltd, Shanghai, Peoples R China
[3] Cent S Univ, Coll Chem & Chem Engn, Changsha, Hunan, Peoples R China
Keywords
Variable selection; Partial least squares; Variable permutation population analysis; Model population analysis; Exponentially decreasing function; Multivariate spectral calibration; WAVELENGTH INTERVAL SELECTION; LEAST-SQUARES REGRESSION; GENETIC ALGORITHMS; RANDOM FROG; SPECTROSCOPY; ELIMINATION; PLS;
DOI
10.1016/j.chemolab.2016.08.006
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Variable selection plays a pivotal role in the quantitative analysis of near-infrared (NIR) spectra with a large number of variables and relatively few samples. In this study, a novel algorithm, namely variable permutation population analysis (VPPA), which combines variable permutation, model population analysis (MPA) and an exponentially decreasing function (EDF), was proposed for variable selection to improve the prediction performance in multivariate spectral calibration. This method first builds a large number of sub-datasets by a Monte Carlo sampling (MCS) strategy in both sample space and variable space, and the importance of each variable is subsequently evaluated using the difference value order of the corresponding partial least squares (PLS) model prediction error before and after the variable permutation. Next, EDF is applied to eliminate the relatively uninformative variables by force. Ultimately, cross validation is utilized to choose the optimal variable subset. A complete methodology for variable selection is constructed through the above four procedures. Three NIR datasets were presented to illustrate the proposed method and evaluate its performance. With PLS used as the modeling method, the results reveal that VPPA is a promising variable selection method which shows better prediction performance when compared with conventional PLS, subwindow permutation analysis PLS (SPA-PLS), Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), competitive adaptive reweighted sampling PLS (CARS-PLS) and genetic algorithm PLS (GA-PLS). Moreover, the proposed approach employs fewer variables than the variable optimization methods mentioned above. Therefore, the VPPA technique can be recommended for practical implementation in multivariate calibration of NIR spectra. © 2016 Elsevier B.V. All rights reserved.
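The first three procedures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so everything below is an assumption made for illustration. The helper names (`fit_pls1`, `permutation_scores`, `edf_keep_counts`) are hypothetical, the PLS model is restricted to a single latent variable, the variable is permuted in the held-out samples, the score is the mean increase in RMSE rather than the paper's "difference value order", and the EDF constants are borrowed from the CARS schedule (retention ratio 1 at the first iteration, 2/p at the last). The final cross-validation step is omitted.

```python
import math
import random


def fit_pls1(X, y):
    """Fit a one-latent-variable PLS1 model on mean-centred data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]  # scores
    q = sum(a * b for a, b in zip(t, yc)) / (sum(a * a for a in t) or 1.0)

    def predict(row):
        return y_mean + q * sum((row[j] - x_mean[j]) * w[j] for j in range(p))

    return predict


def rmse(predict, X, y):
    return math.sqrt(sum((predict(r) - v) ** 2 for r, v in zip(X, y)) / len(y))


def permutation_scores(X, y, n_models=200, sample_frac=0.8,
                       n_sub_vars=None, rng=None):
    """Mean RMSE increase per variable over Monte Carlo sub-datasets."""
    rng = rng or random.Random(0)
    n, p = len(X), len(X[0])
    n_sub_vars = n_sub_vars or max(2, p // 2)
    totals, counts = [0.0] * p, [0] * p
    for _ in range(n_models):
        # Monte Carlo sampling in sample space and in variable space.
        idx = list(range(n))
        rng.shuffle(idx)
        cut = int(sample_frac * n)
        train, test = idx[:cut], idx[cut:]
        sub_vars = rng.sample(range(p), n_sub_vars)
        X_tr = [[X[i][j] for j in sub_vars] for i in train]
        X_te = [[X[i][j] for j in sub_vars] for i in test]
        y_tr, y_te = [y[i] for i in train], [y[i] for i in test]
        model = fit_pls1(X_tr, y_tr)
        base = rmse(model, X_te, y_te)
        for k, j in enumerate(sub_vars):
            # Permute one variable in the held-out samples only.
            col = [row[k] for row in X_te]
            rng.shuffle(col)
            X_perm = [row[:k] + [c] + row[k + 1:] for row, c in zip(X_te, col)]
            totals[j] += rmse(model, X_perm, y_te) - base
            counts[j] += 1
    return [totals[j] / counts[j] if counts[j] else 0.0 for j in range(p)]


def edf_keep_counts(p, n_iter):
    """Number of variables retained at each iteration (CARS-style EDF).

    Assumes p >= 2 and n_iter >= 2; the constants are an assumption.
    """
    k = math.log(p / 2) / (n_iter - 1)
    a = (p / 2) ** (1 / (n_iter - 1))
    return [round(p * a * math.exp(-k * i)) for i in range(1, n_iter + 1)]
```

In such a sketch, the variables would be ranked by score at each iteration and only the top `edf_keep_counts(p, n_iter)[i]` kept, so that low-scoring variables are removed by force regardless of how small the score differences are.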
Pages: 1-13
Page count: 13
Related papers
50 in total
  • [21] Variable Selection Method in the NIR Quantitative Analysis Model of Total Saponins in Red Ginseng Extract
    An Si-yu
    Zhang Lei
    Shang Xian-zhao
    Yue Hong-shui
    Liu Wen-yuan
    Ju Ai-chun
    SPECTROSCOPY AND SPECTRAL ANALYSIS, 2021, 41 (01) : 206 - 209
  • [22] An Efficient Variable Selection Method for Predictive Discriminant Analysis
    Iduseri A.
    Osemwenkhae J.E.
    Annals of Data Science, 2015, 2 (04) : 489 - 504
  • [23] Efficient variable screening for multivariate analysis
    Silva, APD
    JOURNAL OF MULTIVARIATE ANALYSIS, 2001, 76 (01) : 35 - 62
  • [24] A novel wavelength selection algorithm based on permutation analysis and variable combination
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 2665 - 2670
  • [25] Model validation by permutation tests: Applications to variable selection
    Lindgren, F
    Hansen, B
    Karcher, W
    Sjostrom, M
    Eriksson, L
    JOURNAL OF CHEMOMETRICS, 1996, 10 (5-6) : 521 - 532
  • [26] An overview of variable selection methods in multivariate analysis of near-infrared spectra
    Yun, Yong-Huan
    Li, Hong-Dong
    Deng, Bai-Chuan
    Cao, Dong-Sheng
    TRAC-TRENDS IN ANALYTICAL CHEMISTRY, 2019, 113 : 102 - 115
  • [27] Random correlation in variable selection for multivariate calibration with a genetic algorithm
    JouanRimbaud, D
    Massart, DL
    deNoord, OE
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 1996, 35 (02) : 213 - 220
  • [28] Predictive variable selection for the multivariate linear model
    Ibrahim, JG
    Chen, MH
    BIOMETRICS, 1997, 53 (02) : 465 - 478
  • [29] Predictive variable selection for the multivariate linear model
    Ibrahim, J. G.
    Chen, M.-H.
    Biometrics, 53 (02):
  • [30] Feature Variable Selection Based on VIS-NIR Spectra and Soil Moisture Content Prediction Model Construction
    Zhou, Nan
    Hong, Jin
    Song, Bo
    Wu, Shichao
    Wei, Yichen
    Wang, Tao
    JOURNAL OF SPECTROSCOPY, 2024, 2024