Z-Glyph: Visualizing outliers in multivariate data

Cited by: 32
Authors
Cao, Nan [1]
Lin, Yu-Ru [2]
Gotz, David [3]
Du, Fan [4]
Affiliations
[1] Tongji Univ, Shanghai, Peoples R China
[2] Univ Pittsburgh, Pittsburgh, PA USA
[3] Univ N Carolina, Chapel Hill, NC USA
[4] Univ Maryland, College Pk, MD 20742 USA
Funding
U.S. National Science Foundation
Keywords
Outlier detection; anomaly detection; information visualization; multidimensional data visualization; interactive visualization; intrusion; taxonomy; number
DOI
10.1177/1473871616686635
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Outlier analysis techniques are extensively used in many domains such as intrusion detection. Today, even with the most advanced statistical learning techniques, human judgment still plays an important role in outlier analysis tasks due to the difficulty of defining and collecting outlier examples. This work seeks to tackle this problem by introducing a new visualization design, Z-Glyph, a family of glyphs designed to facilitate human judgment in outlier analysis of multivariate data. By employing a location-scale transformation, a Z-Glyph represents the normal data using regular shapes (e.g. a straight line or a circle), such that abnormal data are revealed when they deviate from the regular shapes. An extensive controlled experiment and case studies based on real-world datasets indicate the superior performance of the Z-Glyph family compared with the baselines, suggesting that the proposed design is able to combine human perceptual features with statistical characterization. This study contributes to a more fundamental understanding of how to design visual representations for revealing outliers in multivariate data, which can be applied as a building block in many domain-specific anomaly detection applications.
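The abstract does not spell out the transformation, so the following is only a minimal sketch of the general idea, assuming a plain per-dimension z-score as the location-scale transformation and a circular baseline. The names `location_scale`, `glyph_vertices`, `max_deviation`, `base_radius`, and `gain`, as well as the toy data, are hypothetical and not taken from the paper.

```python
import math
from statistics import mean, stdev

def location_scale(rows):
    """Standardize each column: z = (x - mean) / sample standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [stdev(c) or 1.0 for c in cols]  # fall back to 1.0 if a column has zero spread
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]

def glyph_vertices(z_row, base_radius=1.0, gain=0.25):
    """Place one vertex per dimension around a circle.

    A z-score of 0 lands exactly on the baseline circle, so a typical
    record draws a near-regular polygon and an unusual one bulges or dents.
    base_radius and gain are assumed, illustrative parameters.
    """
    n = len(z_row)
    points = []
    for i, z in enumerate(z_row):
        r = max(0.0, base_radius + gain * z)  # clamp so the radius never goes negative
        theta = 2 * math.pi * i / n
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

def max_deviation(z_row):
    """Largest absolute z-score in a record, a crude 'how irregular' score."""
    return max(abs(z) for z in z_row)

# Toy data (made up): the last record is unusual in its first dimension.
data = [
    [5.0, 10.0, 1.0],
    [5.2, 9.8, 1.1],
    [4.9, 10.1, 0.9],
    [5.1, 10.2, 1.0],
    [9.0, 10.0, 1.0],
]
z = location_scale(data)
most_irregular = max(range(len(z)), key=lambda i: max_deviation(z[i]))
```

In this sketch a record that matches the column means maps to a regular polygon inscribed in the baseline circle, which is one way to read "normal data as regular shapes"; the actual Z-Glyph designs may differ in both the transformation and the shape mapping.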
Pages: 22-40
Number of pages: 19