Measuring the Novelty of Natural Language Text using the Conjunctive Clauses of a Tsetlin Machine Text Classifier

Cited by: 10
Authors
Bhattarai, Bimal [1]
Granmo, Ole-Christoffer [1]
Jiao, Lei [1]
Affiliations
[1] Univ Agder, Dept Informat & Commun Technol, Grimstad, Norway
Keywords
Novelty Detection; Deep Learning; Interpretable; Tsetlin Machine;
DOI
10.5220/0010382204100417
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most supervised text classification approaches assume a closed world, counting on all classes being present in the data at training time. This assumption can lead to unpredictable behaviour during operation, whenever novel, previously unseen, classes appear. Although deep learning-based methods have recently been used for novelty detection, they are challenging to interpret due to their black-box nature. This paper addresses interpretable open-world text classification, where the trained classifier must deal with novel classes during operation. To this end, we extend the recently introduced Tsetlin machine (TM) with a novelty scoring mechanism. The mechanism uses the conjunctive clauses of the TM to measure to what degree a text matches the classes covered by the training data. We demonstrate that the clauses provide a succinct interpretable description of known topics, and that our scoring mechanism makes it possible to discern novel topics from the known ones. Empirically, our TM-based approach outperforms seven other novelty detection schemes on three out of five datasets, and performs second and third best on the remaining, with the added benefit of an interpretable propositional logic-based representation.
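The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the hand-written clauses, the `Clause` class, the `novelty_report` helper, and the fixed threshold are all assumptions made for illustration. A real Tsetlin machine learns its clauses from data, and the paper's scoring mechanism may combine clause outputs differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """A conjunctive clause over bag-of-words literals (hypothetical structure)."""
    required: frozenset   # words that must appear (positive literals)
    forbidden: frozenset  # words that must be absent (negated literals)
    polarity: int         # +1 votes for the class, -1 votes against it

    def fires(self, words):
        # The conjunction holds only if every literal is satisfied.
        return self.required <= words and not (self.forbidden & words)


def class_score(clauses, words):
    """Sum the signed votes of all clauses that fire on the text."""
    return sum(c.polarity for c in clauses if c.fires(words))


def novelty_report(known_classes, text, threshold=1):
    """Return per-class scores and a novelty flag.

    Assumption: a text counts as novel when no known class reaches the
    threshold. The threshold value here is arbitrary.
    """
    words = frozenset(text.lower().split())
    scores = {name: class_score(cs, words) for name, cs in known_classes.items()}
    return scores, max(scores.values()) < threshold


# Hand-written toy clauses standing in for learned ones.
KNOWN = {
    "sports": [
        Clause(frozenset({"match", "goal"}), frozenset(), +1),
        Clause(frozenset({"team"}), frozenset({"election"}), +1),
        Clause(frozenset({"parliament"}), frozenset(), -1),
    ],
    "politics": [
        Clause(frozenset({"election"}), frozenset(), +1),
        Clause(frozenset({"parliament"}), frozenset({"goal"}), +1),
        Clause(frozenset({"goal"}), frozenset(), -1),
    ],
}

if __name__ == "__main__":
    print(novelty_report(KNOWN, "the team scored a late goal in the match"))
    print(novelty_report(KNOWN, "new telescope images reveal a distant galaxy"))
```

Because each clause is a plain conjunction of word literals, the clauses that fire for a text double as a readable explanation of why it matched, or failed to match, a known topic.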
Pages: 410-417
Page count: 8