Measuring the Novelty of Natural Language Text using the Conjunctive Clauses of a Tsetlin Machine Text Classifier

Cited by: 10
Authors
Bhattarai, Bimal [1]
Granmo, Ole-Christoffer [1]
Jiao, Lei [1]
Affiliations
[1] Univ Agder, Dept Informat & Commun Technol, Grimstad, Norway
Keywords
Novelty Detection; Deep Learning; Interpretable; Tsetlin Machine
DOI
10.5220/0010382204100417
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most supervised text classification approaches assume a closed world, counting on all classes being present in the data at training time. This assumption can lead to unpredictable behaviour during operation whenever novel, previously unseen classes appear. Although deep learning-based methods have recently been used for novelty detection, they are challenging to interpret due to their black-box nature. This paper addresses interpretable open-world text classification, where the trained classifier must deal with novel classes during operation. To this end, we extend the recently introduced Tsetlin machine (TM) with a novelty scoring mechanism. The mechanism uses the conjunctive clauses of the TM to measure to what degree a text matches the classes covered by the training data. We demonstrate that the clauses provide a succinct interpretable description of known topics, and that our scoring mechanism makes it possible to discern novel topics from the known ones. Empirically, our TM-based approach outperforms seven other novelty detection schemes on three out of five datasets, and performs second and third best on the remaining two, with the added benefit of an interpretable propositional logic-based representation.
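The clause-based scoring idea in the abstract can be illustrated with a small sketch. It is not the authors' scoring formula or any Tsetlin machine library API. It assumes hand-written clauses (a real TM learns them), a bag-of-words input, and a hypothetical decision rule that flags a text as novel when no known class collects enough clause votes. The names `Clause`, `best_known_class_score`, `is_novel`, and the `threshold` parameter are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """A conjunction of literals over word presence (illustrative structure)."""
    required: frozenset   # words that must appear in the text
    forbidden: frozenset  # words that must not appear (negated literals)
    polarity: int         # +1 votes for the class, -1 votes against it

    def matches(self, words: set) -> bool:
        # The conjunction is true only if every literal holds.
        return self.required <= words and self.forbidden.isdisjoint(words)


def class_score(clauses, words):
    """Sum the votes of all clauses that evaluate to true on the text."""
    return sum(c.polarity for c in clauses if c.matches(words))


def best_known_class_score(clauses_by_class, text):
    """Highest clause-vote sum over the known classes."""
    words = set(text.lower().split())
    return max(class_score(cs, words) for cs in clauses_by_class.values())


def is_novel(clauses_by_class, text, threshold=1):
    """Assumed rule: novel if no known class reaches the vote threshold."""
    return best_known_class_score(clauses_by_class, text) < threshold


# Hand-written clauses standing in for what a trained TM would learn.
CLAUSES = {
    "sports": [
        Clause(frozenset({"match", "goal"}), frozenset(), +1),
        Clause(frozenset({"team"}), frozenset({"election"}), +1),
        Clause(frozenset({"election"}), frozenset(), -1),
    ],
    "politics": [
        Clause(frozenset({"election"}), frozenset(), +1),
        Clause(frozenset({"vote"}), frozenset({"goal"}), +1),
        Clause(frozenset({"goal"}), frozenset(), -1),
    ],
}

known_text = "the team scored a late goal to win the match"
novel_text = "new telescope images reveal a distant galaxy"

print(is_novel(CLAUSES, known_text))  # False: two sports clauses match
print(is_novel(CLAUSES, novel_text))  # True: no clause of any class matches
```

Because each clause is a readable conjunction of word literals, the clauses that fire (or fail to fire) double as the explanation for the novelty decision, which is the interpretability benefit the abstract refers to.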
Pages: 410-417
Number of pages: 8