Towards Reliable Drift Detection and Explanation in Text Data

Cited by: 0

Authors:

Feldhans, Robert [1]

Hammer, Barbara [1]

Affiliations:

[1] Bielefeld Univ, Bielefeld, Germany

Source:

INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2024, PT I | 2025 / Vol. 15346

Keywords:

Drift Explanation; Text Data; Transformer; Visualization

DOI:

10.1007/978-3-031-77731-8_28

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

When delivered to the market, machine learning models face new data which are possibly subject to novel characteristics - a phenomenon known as concept drift. As this might lead to performance degradation, it is necessary to detect such drift and, if required, adapt the model accordingly. While a variety of drift detection and adaptation methods exists for standard vectorial data, a suitable treatment of text data is less researched. In this work we present a novel approach which detects and explains drift in text data based on their representation via transformer embeddings. In a nutshell, the method generates suitable statistical features from the original distribution and the possibly shifted variation. Based on these representations, drift scores can be assigned to individual data points, allowing a visualization and human-readable characterization of the type of drift. We demonstrate the approach's effectiveness in reliably detecting drift in several experiments.

Pages: 301-312

Page count: 12
