Comparison of Multivariate Outlier Detection Methods for Nearly Elliptical Distributions

Cited by: 3
Authors
Wada, Kazumi [1 ]
Kawano, Mariko [1 ]
Tsubaki, Hiroe [1 ]
Affiliation
[1] Natl Stat Ctr NSTAC, Shinjyuku Ku, 19-1 Wakamatsu Cho, Tokyo 1628668, Japan
Keywords
robust estimation; location and scatter; MSD; BACON; Fast-MCD; NNVE; R;
DOI
10.17713/ajs.v49i2.872
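Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract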
This paper evaluates the performance of outlier detection methods on asymmetrically distributed datasets. With practical applications in national survey data processing in mind, we choose four estimators that are known to perform well on elliptically distributed data: the modified Stahel-Donoho (MSD) estimators, the blocked adaptive computationally efficient outlier nominators (BACON), the minimum covariance determinant estimator obtained by a fast algorithm (Fast-MCD), and the nearest-neighbour variance estimator (NNVE). For the evaluation, we adopt a multivariate skew-t data model in which only the direction of the main axis is skewed, contaminated with outliers that follow another probability distribution. We conduct a Monte Carlo simulation under this data distribution to compare outlier detection performance. We also explore the applicability of the selected methods to several accounting items in small and medium enterprise survey data. The results indicate that the MSD estimators are the most suitable.
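The general idea behind the compared methods, flagging observations whose Mahalanobis distance from a robust estimate of location and scatter is too large, can be illustrated with a minimal sketch. It is not the authors' implementation and not any of the four estimators: it is a simplified subset iteration loosely in the spirit of BACON, without that method's correction factors. The function names, the bivariate normal "clean" data (instead of the skew-t model), the contamination cluster, and the 0.999 cutoff level are all assumptions made for illustration.

```python
import math
import random
import statistics


def mean_cov_2d(points):
    """Sample mean and 2x2 covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_mahalanobis(p, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def flag_outliers(points, level=0.999, max_iter=50):
    """Return one boolean per point; True marks a suspected outlier."""
    # The chi-square quantile with 2 degrees of freedom has a closed form.
    cutoff = -2.0 * math.log(1.0 - level)
    # Initial subset: the half of the data closest to the coordinate-wise median.
    med_x = statistics.median(p[0] for p in points)
    med_y = statistics.median(p[1] for p in points)
    order = sorted(
        range(len(points)),
        key=lambda i: (points[i][0] - med_x) ** 2 + (points[i][1] - med_y) ** 2,
    )
    subset = set(order[: len(points) // 2])
    for _ in range(max_iter):
        # Estimate location and scatter from the current subset only.
        center, cov = mean_cov_2d([points[i] for i in sorted(subset)])
        new_subset = {
            i for i, p in enumerate(points)
            if sq_mahalanobis(p, center, cov) <= cutoff
        }
        if new_subset == subset:
            break
        subset = new_subset
    return [i not in subset for i in range(len(points))]


# Hypothetical contaminated sample: 190 correlated normal points plus
# 10 planted outliers clustered around (6, -6).
rng = random.Random(0)
clean = []
for _ in range(190):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    clean.append((z1, 0.7 * z1 + math.sqrt(1 - 0.7 ** 2) * z2))
planted = [(6 + rng.gauss(0, 0.5), -6 + rng.gauss(0, 0.5)) for _ in range(10)]
data = clean + planted

flags = flag_outliers(data)
n_planted_found = sum(flags[190:])
n_clean_flagged = sum(flags[:190])
print(n_planted_found, n_clean_flagged)
```

Because the location and scatter are re-estimated only from points currently regarded as clean, the planted cluster cannot inflate the covariance and mask itself, which is the weakness of a classical mean-and-covariance estimate that robust estimators are meant to address.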
Pages: 1-17
Page count: 17