Practical approaches to principal component analysis for simultaneously dealing with missing and censored elements in chemical data

Cited by: 7
Authors
Stanimirova, I. [1]
Affiliations
[1] Silesian Univ, Inst Chem, Dept Theoret Chem, PL-40006 Katowice, Poland
Keywords
Left-censored data; Generalized nonlinear iterative partial least squares algorithm; Maximum likelihood principal component analysis; Expectation-maximization algorithm; Positive matrix factorization; MULTIVARIATE CURVE RESOLUTION; MAXIMUM-LIKELIHOOD; DATA SETS; DETECTION LIMIT; INCOMPLETE DATA; OUTLIERS; VALUES; NONDETECTS; REGRESSION
DOI
10.1016/j.aca.2013.08.026
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Multivariate chemical data often contain elements that are missing completely at random and so-called left-censored elements, whose values are only known to be below a definite threshold value (the reporting limit). In the last several years, attention has been paid to developing methods for dealing with data containing missing elements and methods that can handle data with both missing elements and outliers. However, processing data with both missing and left-censored elements is still an open problem. The aim of this work was to investigate which method is most suitable for handling left-censored and missing completely at random elements that are present simultaneously in chemical data. The comparison covered the recently proposed generalized nonlinear iterative partial least squares (NIPALS) algorithm, methods that include uncertainty information, such as maximum likelihood principal component analysis (MLPCA), and replacement methods. The results of the Monte Carlo simulation study for artificial and real data sets showed that substitution with half of the reporting limit can be used when the percentage of left-censored elements per variable is up to 30-40%. The generalized NIPALS algorithm is generally recommended for a large percentage of left-censored elements per variable, and particularly when a large number of variables are censored. The expectation-maximization approach applied to data with censored elements substituted with half of the reporting limits can be a strategy for dealing with missing and left-censored elements in data, but if the convergence criterion is not fulfilled, then the generalized NIPALS algorithm can be applied. © 2013 Elsevier B.V. All rights reserved.
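The strategy named at the end of the abstract (substitute censored elements with half of the reporting limit, then run an expectation-maximization fill-in for the missing elements) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `substitute_half_limit` and `em_pca_impute`, the simulated data, the 20% censoring level, the number of components, and the convergence tolerance are all assumptions chosen for illustration, and the fill-in step is a generic iterative PCA reconstruction.

```python
import numpy as np


def substitute_half_limit(X, censored, limits):
    """Replace left-censored entries with half of the per-variable reporting limit."""
    X = np.array(X, dtype=float)
    half = np.broadcast_to(np.asarray(limits, dtype=float) / 2.0, X.shape)
    X[censored] = half[censored]
    return X


def em_pca_impute(X, n_components, tol=1e-8, max_iter=500):
    """Iteratively refill NaN entries with a rank-n_components PCA reconstruction.

    Returns the completed matrix and a flag telling whether the
    (assumed) relative-change stopping rule was met.
    """
    X = np.array(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: column means of the observed values.
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.broadcast_to(col_means, X.shape)[missing]
    converged = False
    for _ in range(max_iter):
        mean = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        X_hat = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mean
        change = np.sum((X[missing] - X_hat[missing]) ** 2)
        scale = max(np.sum(X[missing] ** 2), 1.0)
        X[missing] = X_hat[missing]  # only missing cells are updated
        if change <= tol * scale:
            converged = True
            break
    return X, converged


# Hypothetical data: 30 samples, 5 positive variables with 2 latent factors.
rng = np.random.default_rng(0)
X_true = np.exp(0.3 * rng.normal(size=(30, 2)) @ rng.normal(size=(2, 5)))
limits = np.quantile(X_true, 0.2, axis=0)      # assumed reporting limits
missing_mask = rng.random(X_true.shape) < 0.05  # missing completely at random
censored = (X_true < limits) & ~missing_mask

X = X_true.copy()
X[missing_mask] = np.nan
X_sub = substitute_half_limit(X, censored, limits)
X_filled, converged = em_pca_impute(X_sub, n_components=2)
```

If `converged` comes back `False`, the abstract's recommendation is to fall back to the generalized NIPALS algorithm, which is not sketched here.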
Pages: 27-37
Number of pages: 11