FINDING CAUSES OF OUTLIERS IN MULTIVARIATE ENVIRONMENTAL DATA

Cited by: 8
Authors
GARNER, FC
STAPANIAN, MA
FITZGERALD, KE
Keywords
MULTIVARIATE KURTOSIS; GENERALIZED DISTANCE; MULTIVARIATE OUTLIERS;
DOI
10.1002/cem.1180050311
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Multivariate outliers in environmental data sets are often caused by atypical measurement error in a single variable. From a quality assurance perspective it is important to identify these variables efficiently so that corrective actions may be performed. We demonstrate a procedure for using two multivariate tests to identify which variable 'caused' each outlier. The procedure is tested with simulated data sets that have the same correlation structure as selected water chemistry variables from a survey of lakes in the Western United States. The success rates are evaluated for three of the variables for sample sizes of 50 and 100, significance levels of 0.01 and 0.05 and various amounts of mean shift. The procedure works best for highly correlated variables.
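The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the two tests are based on the quantities named in the keywords (generalized distance and multivariate kurtosis) and uses a hypothetical attribution rule: blame the variable whose removal reduces an outlier's generalized distance the most. The function names, the equicorrelated data with correlation 0.9, and the shift of 8 are all illustrative assumptions.

```python
import numpy as np


def generalized_distance_sq(points, mean, cov):
    """Squared generalized (Mahalanobis) distance of each row of `points`."""
    diff = points - mean
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)


def multivariate_kurtosis(X):
    """Mardia's sample multivariate kurtosis: the mean of the squared D^2 values."""
    cov = np.cov(X, rowvar=False, bias=True)  # maximum-likelihood covariance
    d2 = generalized_distance_sq(X, X.mean(axis=0), cov)
    return float(np.mean(d2 ** 2))


def suspect_variable(X, row):
    """Hypothetical attribution rule (an assumption, not taken from the paper):
    return the index of the variable whose removal lowers the generalized
    distance of `row` the most, with mean and covariance estimated from
    all other rows."""
    rest = np.delete(X, row, axis=0)
    mean, cov = rest.mean(axis=0), np.cov(rest, rowvar=False)
    point = X[row:row + 1]
    full = generalized_distance_sq(point, mean, cov)[0]
    drops = []
    for j in range(X.shape[1]):
        keep = [k for k in range(X.shape[1]) if k != j]
        reduced = generalized_distance_sq(
            point[:, keep], mean[keep], cov[np.ix_(keep, keep)]
        )[0]
        drops.append(full - reduced)
    return int(np.argmax(drops))


# Simulated data: 100 samples of 3 highly correlated variables (illustrative values).
rng = np.random.default_rng(0)
corr = np.full((3, 3), 0.9) + 0.1 * np.eye(3)
X = rng.multivariate_normal(np.zeros(3), corr, size=100)
X[7, 1] += 8.0  # atypical measurement error in a single variable of one sample

# Flag the sample with the largest generalized distance, then attribute it.
d2 = generalized_distance_sq(X, X.mean(axis=0), np.cov(X, rowvar=False))
flagged = int(np.argmax(d2))
print(flagged, suspect_variable(X, flagged), multivariate_kurtosis(X))
```

A real application would compare the kurtosis and distance statistics against critical values at the chosen significance level (0.01 or 0.05 in the abstract) rather than simply taking the largest distance, as this sketch does.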
Pages: 241-248
Page count: 8