Measuring the Novelty of Natural Language Text using the Conjunctive Clauses of a Tsetlin Machine Text Classifier

Cited by: 10
Authors
Bhattarai, Bimal [1]
Granmo, Ole-Christoffer [1]
Jiao, Lei [1]
Affiliation
[1] Univ Agder, Dept Informat & Commun Technol, Grimstad, Norway
Keywords
Novelty Detection; Deep Learning; Interpretable; Tsetlin Machine
DOI
10.5220/0010382204100417
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most supervised text classification approaches assume a closed world, counting on all classes being present in the data at training time. This assumption can lead to unpredictable behaviour during operation, whenever novel, previously unseen, classes appear. Although deep learning-based methods have recently been used for novelty detection, they are challenging to interpret due to their black-box nature. This paper addresses interpretable open-world text classification, where the trained classifier must deal with novel classes during operation. To this end, we extend the recently introduced Tsetlin machine (TM) with a novelty scoring mechanism. The mechanism uses the conjunctive clauses of the TM to measure to what degree a text matches the classes covered by the training data. We demonstrate that the clauses provide a succinct interpretable description of known topics, and that our scoring mechanism makes it possible to discern novel topics from the known ones. Empirically, our TM-based approach outperforms seven other novelty detection schemes on three out of five datasets, and performs second and third best on the remaining, with the added benefit of an interpretable propositional logic-based representation.
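The clause-based scoring idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Clause` class, the hand-written clauses, the two topic names, and the simple thresholding rule are hypothetical and are not taken from the paper, which learns its clauses with a Tsetlin machine and may score novelty differently. The sketch only shows how conjunctive clauses over word presence can vote for known classes, and how a text that matches few clauses can be flagged as novel.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Clause:
    """A conjunction of literals over word-presence features (hypothetical)."""
    required: FrozenSet[str]   # words that must be present
    forbidden: FrozenSet[str]  # words that must be absent (negated literals)
    polarity: int              # +1 votes for the class, -1 votes against it

    def matches(self, words: Set[str]) -> bool:
        # The conjunction holds only if every literal is satisfied.
        return self.required <= words and not (self.forbidden & words)


def class_score(clauses: List[Clause], words: Set[str]) -> int:
    """Sum the votes of all clauses whose conjunction matches the text."""
    return sum(c.polarity for c in clauses if c.matches(words))


def is_novel(known: Dict[str, List[Clause]], text: str, threshold: int) -> bool:
    """Flag a text as novel when no known class scores at least `threshold`.

    Assumption: a plain threshold on the best class score stands in for
    the scoring mechanism described in the abstract.
    """
    words = set(text.lower().split())
    best = max(class_score(clauses, words) for clauses in known.values())
    return best < threshold


# Hand-written clauses for two known topics (illustrative, not learned).
known_classes = {
    "sports": [
        Clause(frozenset({"match", "goal"}), frozenset({"stock"}), +1),
        Clause(frozenset({"team"}), frozenset(), +1),
        Clause(frozenset({"market"}), frozenset(), -1),
    ],
    "finance": [
        Clause(frozenset({"stock", "market"}), frozenset(), +1),
        Clause(frozenset({"bank"}), frozenset(), +1),
    ],
}

known_text = "the team scored a goal in the match"
novel_text = "new telescope images a distant galaxy"

print(is_novel(known_classes, known_text, threshold=1))
print(is_novel(known_classes, novel_text, threshold=1))
```

Because each clause is a readable conjunction of present and absent words, the clauses that fire for a text double as an explanation of why it was assigned to a known topic or flagged as novel.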
Pages: 410-417
Page count: 8