A predictive DEA model for outlier detection

Cited by: 8
Authors
Yang, Mingwen [1, 2]
Wan, Guohua [1 ]
Zheng, Eric [2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Antai Coll Econ & Management, Shanghai 200030, Peoples R China
[2] Univ Texas Dallas, Naveen Jindal Sch Management, Richardson, TX 75080 USA
Keywords
predictive DEA; Bi-super DEA; outlier detection; simulation
DOI
10.1080/23270012.2014.889911
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
Outlier detection is a key issue in data-driven analytics. In this paper, we propose Bi-super DEA, a super DEA-based method that constructs both efficient and inefficient frontiers for outlier detection. To evaluate its predictive performance, we develop a novel predictive DEA procedure, PDEA, which extends conventional DEA approaches, used primarily for in-sample efficiency estimation, to predict outputs out of sample. This enables us to compare the predictive performance of our approach against several popular outlier detection methods, including parametric robust regression from statistics and non-parametric k-means from data mining. We conduct comprehensive simulation experiments to examine the relative performance of these outlier detection methods under the influence of five factors: sample size, linearity of the production function, normality of the noise distribution, homogeneity of the data, and the level of random noise contaminating the data generating process (DGP). We find that, somewhat surprisingly, Bi-super CCR consistently outperforms Bi-super BCC in detecting outliers. Under the linearity, normality and homogeneity conditions, the parametric robust regression method works best. However, when the DGP violates these conditions, Bi-super DEA emerges as the better choice due to its distribution-free property. Our results shed light on the conditions under which each method excels or fails and provide users with practical guidelines on how to choose appropriate methods to detect outliers.
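The abstract does not give the Bi-super DEA formulation, so the following is only a minimal sketch of the general idea of scoring each unit against frontiers built from all the other units. It assumes the simplest possible setting (one input, one output, constant returns to scale, strictly positive data), where the super-efficiency CCR score reduces to a ratio of productivities. The function names, the "lower" score used as a stand-in for an inefficient frontier, and the `tol` threshold are all hypothetical and are not taken from the paper.

```python
def super_efficiency_scores(inputs, outputs):
    """Score each unit against frontiers built from all OTHER units.

    Assumes one input and one output per unit, constant returns to scale,
    and strictly positive data. In that case the super-efficiency CCR score
    of a unit is its productivity (output / input) divided by the best
    productivity among the remaining units.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    upper, lower = [], []
    for k, r in enumerate(ratios):
        others = ratios[:k] + ratios[k + 1:]
        # Score above 1: the unit lies beyond the efficient frontier of the rest.
        upper.append(r / max(others))
        # Score below 1: the unit lies below the worst-practice frontier of the
        # rest (an illustrative analogue of an "inefficient frontier").
        lower.append(r / min(others))
    return upper, lower


def flag_outliers(inputs, outputs, tol=0.5):
    """Return indices of units far outside either frontier (tol is hypothetical)."""
    upper, lower = super_efficiency_scores(inputs, outputs)
    return [
        k for k, (u, l) in enumerate(zip(upper, lower))
        if u > 1 + tol or l < 1 - tol
    ]


if __name__ == "__main__":
    x = [1.0, 1.0, 1.0, 1.0, 1.0]
    y = [2.0, 2.2, 1.9, 6.0, 0.5]
    print(flag_outliers(x, y))
```

With several inputs or outputs the ratio shortcut no longer applies, and each score would instead come from solving a linear program per unit with that unit excluded from the reference set.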
Pages: 20-41
Page count: 22