An approach for outlier and novelty detection for text data based on classifier confidence

Cited by: 1
Authors
Pizurica, Nikola [1]
Tomovic, Savo [1]
Affiliations
[1] Univ Montenegro, Fac Math & Nat Sci, Cetinjska 2, Podgorica 81000, Montenegro
Keywords
Classification; novelty detection; outlier detection; classifier confidence; information retrieval
DOI
10.3233/AIC-200649
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present an approach to novelty detection in text data. The approach can also be regarded as semi-supervised anomaly detection because it operates on a training dataset containing labelled instances of the known classes only. A classification model is learned during the training phase; at least two known classes are assumed to exist in the available training dataset. In the testing phase, instances are classified as normal or anomalous based on the classifier's confidence: if the classifier cannot assign any of the known class labels to a given instance with sufficiently high confidence (probability), the instance is declared a novelty (anomaly). We propose two procedures to objectively measure the classifier confidence. Experimental results show that the proposed approach is comparable to methods known in the literature.
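The decision rule described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the two confidence measures the authors propose, so the version below assumes the simplest possible stand-in: the highest class probability compared against a fixed threshold. The function name `classify_or_flag`, the default threshold of 0.7, and the example class labels are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional


def classify_or_flag(probs: Dict[str, float], threshold: float = 0.7) -> Optional[str]:
    """Return the most probable known class, or None to flag a novelty.

    Assumption: confidence is taken to be the maximum class probability.
    This is an illustrative stand-in, not the measure proposed in the paper.
    """
    # The abstract assumes at least two known classes in the training data.
    if len(probs) < 2:
        raise ValueError("at least two known classes are required")

    # Pick the known class with the highest predicted probability.
    label, confidence = max(probs.items(), key=lambda item: item[1])

    # Accept the label only if the classifier is confident enough;
    # otherwise declare the instance a novelty (anomaly).
    return label if confidence >= threshold else None


# Hypothetical classifier outputs for two documents.
confident = classify_or_flag({"sports": 0.91, "politics": 0.09})
uncertain = classify_or_flag({"sports": 0.55, "politics": 0.45})
```

Here `confident` receives a known class label, while `uncertain` is `None` because no class reaches the threshold. In practice the threshold would need to be tuned, for example on held-out data.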
Pages: 139-153
Page count: 15