Locally centred Mahalanobis distance: A new distance measure with salient features towards outlier detection

Cited by: 51
Authors
Todeschini, Roberto [1]
Ballabio, Davide [1]
Consonni, Viviana [1]
Sahigara, Faizan [1]
Filzmoser, Peter [2]
Affiliations
[1] Univ Milano Bicocca, Dept Earth & Environm Sci, Milano Chemometr & QSAR Res Grp, I-20126 Milan, Italy
[2] Vienna Univ Technol, Dept Stat & Probabil Theory, A-1040 Vienna, Austria
Keywords
Mahalanobis distance; Outlier detection; Similarity; Isolation degree; Remoteness; Covariance matrix; Data mining;
DOI
10.1016/j.aca.2013.04.034
Chinese Library Classification number
O65 [Analytical chemistry]
Discipline classification codes
070302; 081704
Abstract
Outlier detection is a prerequisite for identifying aberrant samples in a given set of data. Identifying such samples is particularly important in multivariate data analysis, where increasing dimensionality can easily hinder data exploration and outliers often go undetected. This paper introduces a novel Mahalanobis distance measure (more precisely, a pseudo-distance) termed the locally centred Mahalanobis distance, derived by centring the covariance matrix at each data sample rather than at the data centroid as in the classical covariance matrix. Two parameters, called Remoteness and Isolation degree, were derived from the resulting pairwise distance matrix; their salient features facilitated better identification of atypical samples isolated from the rest of the data, reflecting their potential application to outlier detection. The Isolation degree proved able to detect a new kind of outlier, namely isolated samples within the data domain, which makes it a useful diagnostic tool for evaluating the reliability of predictions obtained by local models (e.g. k-NN models). To better understand the role of Remoteness and Isolation degree in identifying such aberrant samples, several simulated data sets and published data sets from the literature were considered as case studies, and the results were compared with those obtained using the Euclidean distance and the classical Mahalanobis distance. (C) 2013 Elsevier B.V. All rights reserved.
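The centring idea in the abstract can be illustrated with a minimal sketch. It assumes that the covariance matrix for sample i is built from the deviations of all samples from x_i and normalised by n - 1; the abstract does not give the exact formula, so this normalisation, the use of a pseudo-inverse, and the function name `locally_centred_mahalanobis` are assumptions for illustration only. Remoteness and Isolation degree are not computed here because the abstract does not define them.

```python
import numpy as np


def locally_centred_mahalanobis(X):
    """Pairwise pseudo-distance matrix D.

    Row i holds the distances from sample i to every sample, measured with
    a covariance matrix centred at sample i (assumed form, not taken from
    the paper's equations).
    """
    X = np.asarray(X, dtype=float)
    n, _ = X.shape
    D = np.zeros((n, n))
    for i in range(n):
        diff = X - X[i]                      # deviations from sample i
        S_i = diff.T @ diff / (n - 1)        # covariance centred at x_i (assumed n - 1)
        S_inv = np.linalg.pinv(S_i)          # pseudo-inverse guards against singular S_i
        # Quadratic form diff_j^T S_inv diff_j for every sample j
        q = np.einsum("jk,kl,jl->j", diff, S_inv, diff)
        D[i] = np.sqrt(np.maximum(q, 0.0))   # clip tiny negative round-off
    return D


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 3))
    D = locally_centred_mahalanobis(X)
    # Each row uses its own covariance matrix, so D is generally not symmetric.
    print(D.shape, np.allclose(D, D.T))
```

Because each row is computed with a different covariance matrix, d(i, j) and d(j, i) generally differ, which is consistent with the abstract calling the measure a pseudo-distance.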
Pages: 1 - 9
Number of pages: 9
Related papers
50 in total
  • [21] A new fault detection index based on Mahalanobis distance and kernel method
    Hajer Lahdhiri
    Okba Taouali
    Ilyes Elaissi
    Ines Jaffel
    Mohamed Faouzi Harakat
    Hassani Messaoud
    The International Journal of Advanced Manufacturing Technology, 2017, 91 : 2799 - 2809
  • [22] Anomaly detection for IGBTs using Mahalanobis distance
    Patil, Nishad
    Das, Diganta
    Pecht, Michael
    MICROELECTRONICS RELIABILITY, 2015, 55 (07) : 1054 - 1059
  • [23] Mahalanobis distance measurement based locally linear embedding algorithm
    Zhang, Xing-Fu
    Huang, Shao-Bin
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2012, 25 (02) : 318 - 324
  • [24] A new fault detection index based on Mahalanobis distance and kernel method
    Lahdhiri, Hajer
    Taouali, Okba
    Elaissi, Ilyes
    Jaffel, Ines
    Harakat, Mohamed Faouzi
    Messaoud, Hassani
    INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2017, 91 (5-8) : 2799 - 2809
  • [25] Mahalanobis distance similarity measure based distinguisher for template attack
    Zhang, Hailong
    Zhou, Yongbin
    Feng, Dengguo
    SECURITY AND COMMUNICATION NETWORKS, 2015, 8 (05) : 769 - 777
  • [26] An Approach to Online Fuzzy Clustering Based on the Mahalanobis Distance Measure
    Hu, Zhengbing
    Tyshchenko, Oleksii K.
    ADVANCES IN INTELLIGENT SYSTEMS, COMPUTER SCIENCE AND DIGITAL ECONOMICS, 2020, 1127 : 364 - 374
  • [27] MAHALANOBIS DISTANCE BASED ADVERSARIAL NETWORK FOR ANOMALY DETECTION
    Hou, Yubo
    Chen, Zhenghua
    Wu, Min
    Foo, Chuan-Sheng
    Li, Xiaoli
    Shubair, Raed M.
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020 : 3192 - 3196
  • [28] AC Arc Fault Detection Based on Mahalanobis Distance
    Cai Xiaochen
    Wang Li
    Sun Qiangang
    Meng Zhen
    2012 15TH INTERNATIONAL POWER ELECTRONICS AND MOTION CONTROL CONFERENCE (EPE/PEMC), 2012
  • [29] Salient Region Detection Using Wasserstein Distance Measure Based on Nonlinear Scale Space
    Zhu, Lei
    Cao, Zhiguo
    MIPPR 2013: PATTERN RECOGNITION AND COMPUTER VISION, 2013, 8919
  • [30] Botnet traffic detection using RPCA and Mahalanobis Distance
    Vilaca, Eduardo S. C.
    Vieira, Thiago P. B.
    de Sousa, Rafael T.
    da Costa, Joao Paulo C. L.
    2019 WORKSHOP ON COMMUNICATION NETWORKS AND POWER SYSTEMS (WCNPS), 2019