DEEPLENS: Interactive Out-of-distribution Data Detection in NLP Models

Cited by: 1
Authors
Song, Da [1]
Wang, Zhijie [1]
Huang, Yuheng [1]
Ma, Lei [1, 2]
Zhang, Tianyi [3]
Affiliations
[1] Univ Alberta, Edmonton, AB, Canada
[2] Univ Tokyo, Tokyo, Japan
[3] Purdue Univ, W Lafayette, IN USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Interactive Visualization; Out-of-distribution Detection; Machine Learning; NLP
DOI
10.1145/3544548.3580741
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine Learning (ML) has been widely used in Natural Language Processing (NLP) applications. A fundamental assumption in ML is that training data and real-world data follow a similar distribution. However, a deployed ML model may suffer from out-of-distribution (OOD) issues due to distribution shifts in real-world data. Though many algorithms have been proposed to detect OOD data in text corpora, interactive tool support for ML developers is still lacking. In this work, we propose DEEPLENS, an interactive system that helps users detect and explore OOD issues in massive text corpora. Users can efficiently explore different OOD types in DEEPLENS with the help of a text clustering method. Users can also dig into a specific text by inspecting salient words highlighted through neuron activation analysis. In a within-subjects user study with 24 participants, participants using DEEPLENS accurately found nearly twice as many types of OOD issues, with 22% more confidence, compared with a variant of DEEPLENS that has no interaction or visualization support.
Pages: 17
Related papers
50 in total
  • [31] On Risk Assessment for Out-of-Distribution Detection
    Vasiliuk, Anton
    IEEE ACCESS, 2025, 13 : 18546 - 18568
  • [32] Investigation of out-of-distribution detection across various models and training methodologies
    Kim, Byung Chun
    Kim, Byungro
    Hyun, Yoonsuk
    NEURAL NETWORKS, 2024, 175
  • [33] Likelihood-free Out-of-Distribution Detection with Invertible Generative Models
    Ahmadian, Amirhossein
    Lindsten, Fredrik
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 2119 - 2125
  • [34] Out-of-Distribution Detection by Cross-Class Vicinity Distribution of In-Distribution Data
    Zhao, Zhilin
    Cao, Longbing
    Lin, Kun-Yu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 13777 - 13788
  • [35] Semantic enhanced for out-of-distribution detection
    Jiang, Weijie
    Yu, Yuanlong
    FRONTIERS IN NEUROROBOTICS, 2022, 16
  • [36] Unsupervised evaluation for out-of-distribution detection
    Zhang, Yuhang
    Hu, Jiani
    Wen, Dongchao
    Deng, Weihong
    PATTERN RECOGNITION, 2025, 160
  • [37] Likelihood Ratios for Out-of-Distribution Detection
    Ren, Jie
    Liu, Peter J.
    Fertig, Emily
    Snoek, Jasper
    Poplin, Ryan
    DePristo, Mark A.
    Dillon, Joshua V.
    Lakshminarayanan, Balaji
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [38] Generalized Out-of-Distribution Detection: A Survey
    Yang, Jingkang
    Zhou, Kaiyang
    Li, Yixuan
    Liu, Ziwei
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (12) : 5635 - 5662
  • [39] Semantically Coherent Out-of-Distribution Detection
    Yang, Jingkang
    Wang, Haoqi
    Feng, Litong
    Yan, Xiaopeng
    Zheng, Huabin
    Zhang, Wayne
    Liu, Ziwei
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 8281 - 8289
  • [40] Fast Customization of Chemical Language Models to Out-of-Distribution Data Sets
    Toniato, Alessandra
    Vaucher, Alain C.
    Lehmann, Marzena Maria
    Luksch, Torsten
    Schwaller, Philippe
    Stenta, Marco
    Laino, Teodoro
    CHEMISTRY OF MATERIALS, 2023, 35 (21) : 8806 - 8815