Towards Reliable Drift Detection and Explanation in Text Data

Cited by: 0
Authors
Feldhans, Robert [1]
Hammer, Barbara [1]
Affiliations
[1] Bielefeld Univ, Bielefeld, Germany
Keywords
Drift Explanation; Text Data; Transformer; Visualization
DOI
10.1007/978-3-031-77731-8_28
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When delivered to the market, machine learning models face new data which are possibly subject to novel characteristics - a phenomenon known as concept drift. As this might lead to performance degradation, it is necessary to detect such drift and, if required, adapt the model accordingly. While a variety of drift detection and adaptation methods exists for standard vectorial data, a suitable treatment of text data is less researched. In this work we present a novel approach which detects and explains drift in text data based on their representation via transformer embeddings. In a nutshell, the method generates suitable statistical features from the original distribution and the possibly shifted variation. Based on these representations, drift scores can be assigned to individual data points, allowing a visualization and human-readable characterization of the type of drift. We demonstrate the approach's effectiveness in reliably detecting drift in several experiments.
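The abstract does not specify how the per-point drift scores are computed. As a rough illustration only, and not the authors' method, the sketch below scores each point of a possibly shifted sample by the fraction of its nearest neighbours that come from the same sample rather than from the reference sample. Everything here is an assumption made for the example: the function name `knn_drift_scores`, the parameter `k`, and the synthetic Gaussian vectors that stand in for transformer embeddings.

```python
import numpy as np


def knn_drift_scores(reference: np.ndarray, current: np.ndarray, k: int = 10) -> np.ndarray:
    """Hypothetical per-point drift score (not taken from the paper).

    For each row of `current`, return the fraction of its k nearest
    neighbours in the pooled sample that also belong to `current`.
    Without drift this fraction is close to the share of `current`
    in the pool; under drift it moves towards 1.
    """
    pooled = np.vstack([reference, current])
    is_current = np.concatenate([
        np.zeros(len(reference), dtype=bool),
        np.ones(len(current), dtype=bool),
    ])
    # Squared Euclidean distance from every current point to every pooled point.
    diffs = current[:, None, :] - pooled[None, :, :]
    dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    # A point must not count as its own neighbour.
    rows = np.arange(len(current))
    dists[rows, len(reference) + rows] = np.inf
    neighbours = np.argsort(dists, axis=1)[:, :k]
    return is_current[neighbours].mean(axis=1)


# Synthetic stand-ins for embeddings (assumption: 16 dimensions, 200 texts each).
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 16))
same_distribution = rng.normal(size=(200, 16))
shifted_distribution = rng.normal(loc=1.5, size=(200, 16))

scores_same = knn_drift_scores(reference, same_distribution)
scores_shifted = knn_drift_scores(reference, shifted_distribution)

print("mean score, no drift:", scores_same.mean())
print("mean score, drift:   ", scores_shifted.mean())
```

In practice the random vectors would be replaced by embeddings of the actual texts, and the highest-scoring points could be inspected by hand as candidates for characterising the drift.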
Pages: 301-312
Page count: 12