Comparison of Gaussian process regression, partial least squares, random forest and support vector machines for a near infrared calibration of paracetamol samples

Cited by: 4
Authors
Sow, Aminata [1]
Traore, Issiaka [1]
Diallo, Tidiane [2, 3]
Traore, Mohamed [4]
Ba, Abdramane [1]
Affiliations
[1] Univ Sci Tech & Technol Bamako, Fac Sci & Tech FST, Lab Opt Spect & Sci Atmospher LOSSA, Bamako, Mali
[2] Univ Sci Tech & Technol Bamako, Fac Pharm, Dept Sci Medicament, Bamako, Mali
[3] Lab Natl Sante LNS, Bamako, Mali
[4] Ecole Natl Ingn Abderhamane Baba Toure, Bamako, Mali
Keywords
Paracetamol; Near Infrared Spectroscopy; Data preprocessing; Nonlinear regression models; Linear regression techniques; COMPONENTS; TABLETS
DOI
10.1016/j.rechem.2022.100508
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In this article, we analyze the near-infrared (NIR) spectra of fifty-eight (58) commercial 500 mg paracetamol tablets of different origins (that is, with different batch numbers) sold in the local markets of Bamako. The NIR spectra were recorded in the spectral range 930-1700 nm. The samples are divided into forty-eight (48) calibration (training) samples and ten (10) validation (test) samples. To perform multivariate calibration, we apply three nonlinear regression techniques, Gaussian process regression (GPR), random forest (RF), and support vector machine (KSVM), along with the traditional linear partial least squares regression (PLSR), to several pretreatments of the data from the 58 samples. The results show that the three nonlinear regression calibrations have better prediction performance than PLSR as far as the root mean square error (RMSE) is concerned. To select the best regression model, we avoid R2, since this quantity is not a good parameter for this purpose, and instead compare the multivariate models by RMSE. Additionally, to assess the impact of data preprocessing, we apply the above regression techniques to the original data and to data treated with multiplicative scatter correction (MSC), standard normal variate (SNV) correction, smoothing, first derivative (FD), and second derivative (SD). Overall, GPR applied to the smoothed data gives the lowest errors: RMSEP = 2.303053e-06 for validation (prediction) and RMSEC = 2.112316e-06 for calibration. We also observe that the developed GPR model is more accurate than the other models whichever data preprocessing is used. All in all, GPR can be seen as a powerful alternative regression tool for NIR spectra of paracetamol samples. The statistical parameters of the proposed model are compared with the results of some other models reported in the literature.
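The abstract ranks models by RMSE and lists SNV among the pretreatments. A minimal sketch of both quantities, in plain Python, may help readers unfamiliar with them. The function names `snv` and `rmse` and all numbers below are hypothetical illustrations, not the authors' code or data, and the sketch uses the sample standard deviation (n - 1), which is an assumption about the SNV convention.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """SNV: centre one spectrum on its mean and scale by its own standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1); an assumed convention
    return [(x - m) / s for x in spectrum]


def rmse(reference, predicted):
    """Root mean square error between reference values and model predictions."""
    if len(reference) != len(predicted):
        raise ValueError("inputs must have the same length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical numbers for illustration only; they are not data from the study.
toy_spectrum = [0.10, 0.20, 0.40, 0.30]
toy_reference = [500.0, 498.0, 502.0]
toy_predicted = [499.0, 499.0, 501.0]

corrected = snv(toy_spectrum)               # one preprocessed spectrum
error = rmse(toy_reference, toy_predicted)  # one RMSE value
```

Computed on the calibration samples this error corresponds to RMSEC, and on the held-out validation samples to RMSEP.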
Pages: 7
Related papers
50 in total
  • [41] Optimisation of partial least squares regression calibration models in near-infrared spectroscopy: a novel algorithm for wavelength selection
    Smith, MR
    Jee, RD
    Moffat, AC
    Rees, DR
    Broad, NW
    ANALYST, 2003, 128 (11) : 1312 - 1319
  • [42] Determination of Organic and Inorganic Carbon in Forest Soil Samples by Mid-Infrared Spectroscopy and Partial Least Squares Regression
    Tatzber, Michael
    Mutsch, Franz
    Mentler, Axel
    Leitgeb, Ernst
    Englisch, Michael
    Gerzabek, Martin H.
    APPLIED SPECTROSCOPY, 2010, 64 (10) : 1167 - 1175
  • [43] Physiological interference reduction for near infrared spectroscopy brain activity measurement based on recursive least squares adaptive filtering and least squares support vector machines
    Liu, Xin
    Zhang, Yan
    Liu, Dan
    Wang, Qisong
    Bai, Ou
    Sun, Jinwei
    Rolfe, Peter
    COMPUTER ASSISTED SURGERY, 2019, 24 : 160 - 166
  • [44] Comparison on quantitative inversion of characteristic ions in salinized soils with hyperspectral based on support vector regression and partial least squares regression
    Wang, Jingyi
    Li, Xiaoming
    EUROPEAN JOURNAL OF REMOTE SENSING, 2020, 53 (01) : 340 - 348
  • [45] Research on regional economy prediction based on partial least squares support vector regression
    Hongshan
    Ai, Junjun Shi
    International Journal of Applied Environmental Sciences, 2013, 8 (13): : 1645 - 1652
  • [46] A comparison of Gaussian process regression, random forests and support vector regression for burn severity assessment in diseased forests
    Hultquist, Carolynne
    Chen, Gang
    Zhao, Kaiguang
    REMOTE SENSING LETTERS, 2014, 5 (08) : 723 - 732
  • [47] Analysis of elements in wine using near infrared spectroscopy and partial least squares regression
    Cozzolino, D.
    Kwiatkowski, M. J.
    Dambergs, R. G.
    Cynkar, W. U.
    Janik, L. J.
    Skouroumounis, G.
    Gishen, A.
    TALANTA, 2008, 74 (04) : 711 - 716
  • [48] Modeling Pan Evaporation Using Gaussian Process Regression K-Nearest Neighbors Random Forest and Support Vector Machines; Comparative Analysis
    Shabani, Sevda
    Samadianfard, Saeed
    Sattari, Mohammad Taghi
    Mosavi, Amir
    Shamshirband, Shahaboddin
    Kmet, Tibor
    Varkonyi-Koczy, Annamaria R.
    ATMOSPHERE, 2020, 11 (01)
  • [49] Comparison of Bayesian regression models and partial least squares regression for the development of infrared prediction equations
    Bonfatti, V.
    Tiezzi, F.
    Miglior, F.
    Carnier, P.
    JOURNAL OF DAIRY SCIENCE, 2017, 100 (09) : 7306 - 7319
  • [50] Construction of global and robust near-infrared calibration models based on hybrid calibration sets using Partial Least Squares (PLS) regression
    Ni, Lijun
    Xiao, Lixia
    Yao, Heming
    Ge, Jiong
    Zhang, Liguo
    Luan, Shaorong
    ANALYTICAL LETTERS, 2019, 52 (07) : 1177 - 1194