Understanding Out-of-distribution: A Perspective of Data Dynamics

Cited by: 0
Authors
Adila, Dyah [1]
Kang, Dongyeop [2]
Affiliations
[1] Univ Wisconsin Madison, Dept Comp Sci, Madison, WI 53706 USA
[2] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
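Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract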
Despite machine learning models' success in Natural Language Processing (NLP) tasks, predictions from these models frequently fail on out-of-distribution (OOD) samples. Prior works have focused on developing state-of-the-art methods for detecting OOD samples. The fundamental question of how OOD samples differ from in-distribution samples remains unanswered. This paper explores how data dynamics in training models can be used to understand the fundamental differences between OOD and in-distribution samples in extensive detail. We found that syntactic characteristics of the data samples that the model consistently predicts incorrectly in both OOD and in-distribution cases directly contradict each other. In addition, we observed preliminary evidence supporting the hypothesis that models are more likely to latch onto trivial syntactic heuristics (e.g., overlap of words between two sentences) when making predictions on OOD samples. We hope our preliminary study accelerates the data-centric analysis of various machine learning phenomena.
Pages: 1 / 8
Page count: 8
Related papers
50 in total
  • [31] DEEPLENS: Interactive Out-of-distribution Data Detection in NLP Models
    Song, Da
    Wang, Zhijie
    Huang, Yuheng
    Ma, Lei
    Zhang, Tianyi
    PROCEEDINGS OF THE 2023 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2023, 2023,
  • [32] Panoptic Out-of-Distribution Segmentation
    Mohan, Rohit
    Kumaraswamy, Kiran
    Hurtado, Juana Valeria
    Petek, Kursat
    Valada, Abhinav
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (05) : 4075 - 4082
  • [33] Certifiable Out-of-Distribution Generalization
    Ye, Nanyang
    Zhu, Lin
    Wang, Jia
    Zeng, Zhaoyu
    Shao, Jiayao
    Peng, Chensheng
    Pan, Bikang
    Li, Kaican
    Zhu, Jun
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 10927 - 10935
  • [34] Entropic Out-of-Distribution Detection
    Macedo, David
    Ren, Tsang Ing
    Zanchettin, Cleber
    Oliveira, Adriano L., I
    Ludermir, Teresa
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [35] Watermarking for Out-of-distribution Detection
    Wang, Qizhou
    Liu, Feng
    Zhang, Yonggang
    Zhang, Jing
    Gong, Chen
    Liu, Tongliang
    Han, Bo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [36] Out-of-Distribution Detection by Cross-Class Vicinity Distribution of In-Distribution Data
    Zhao, Zhilin
    Cao, Longbing
    Lin, Kun-Yu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 13777 - 13788
  • [37] Is Out-of-Distribution Detection Learnable?
    Fang, Zhen
    Li, Yixuan
    Lu, Jie
    Dong, Jiahua
    Han, Bo
    Liu, Feng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [38] On the Learnability of Out-of-distribution Detection
    Fang, Zhen
    Li, Yixuan
    Liu, Feng
    Han, Bo
    Lu, Jie
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [39] STEP: Out-of-Distribution Detection in the Presence of Limited In-distribution Labeled Data
    Zhou, Zhi
    Guo, Lan-Zhe
    Cheng, Zhanzhan
    Li, Yu-Feng
    Pu, Shiliang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [40] Reliable deep learning in anomalous diffusion against out-of-distribution dynamics
    Feng, Xiaochen
    Sha, Hao
    Zhang, Yongbing
    Su, Yaoquan
    Liu, Shuai
    Jiang, Yuan
    Hou, Shangguo
    Han, Sanyang
    Ji, Xiangyang
    NATURE COMPUTATIONAL SCIENCE, 2024, 4 (11) : 877 - 877