DEEPLENS: Interactive Out-of-distribution Data Detection in NLP Models

Cited by: 1
Authors
Song, Da [1]
Wang, Zhijie [1]
Huang, Yuheng [1]
Ma, Lei [1, 2]
Zhang, Tianyi [3]
Affiliations
[1] University of Alberta, Edmonton, AB, Canada
[2] University of Tokyo, Tokyo, Japan
[3] Purdue University, West Lafayette, IN, USA
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Interactive Visualization; Out-of-distribution Detection; Machine Learning; NLP
DOI
10.1145/3544548.3580741
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine Learning (ML) has been widely used in Natural Language Processing (NLP) applications. A fundamental assumption in ML is that training data and real-world data should follow a similar distribution. However, a deployed ML model may suffer from out-of-distribution (OOD) issues due to distribution shifts in the real-world data. Though many algorithms have been proposed to detect OOD data from text corpora, there is still a lack of interactive tool support for ML developers. In this work, we propose DEEPLENS, an interactive system that helps users detect and explore OOD issues in massive text corpora. Users can efficiently explore different OOD types in DEEPLENS with the help of a text clustering method. Users can also dig into a specific text by inspecting salient words highlighted through neuron activation analysis. In a within-subjects user study with 24 participants, participants using DEEPLENS were able to accurately find nearly twice as many types of OOD issues, with 22% more confidence, compared with a variant of DEEPLENS that has no interaction or visualization support.
Pages: 17