Estimating a one-class naive Bayes text classifier

Cited by: 6
Authors
Zhang, Yihong [1]
Jatowt, Adam [2]
Affiliations
[1] Osaka Univ, Grad Sch Informat Sci & Technol, Dept Multimedia Engn, Osaka 5650871, Japan
[2] Kyoto Univ, Grad Sch Informat, Dept Social Informat, Kyoto 6068501, Japan
Keywords
Machine learning; naive Bayes; one-class classifier
DOI
10.3233/IDA-194669
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A growing number of information extraction projects need to classify large amounts of text data. The common way to classify text is to build a supervised classifier trained on human-labeled positive and negative examples. In many cases, however, it is easy to label positive examples but hard to label negative examples. In this paper, we address the problem of building a one-class classifier when only the positive examples are labeled. Previous work on building one-class classifiers mostly uses positive examples and unlabeled data. We show that a configurable one-class classifier such as one-class naive Bayes can be optimized by examining the clustering quality of the classification on target data. We propose to use existing and new quality scores for determining the clustering quality of the classification. Experimental analysis with real-world data shows that our approach generally achieves high classification accuracy, and in some cases improves the accuracy by more than 10% compared to state-of-the-art baselines. © 2020 - IOS Press and the authors. All rights reserved.
Pages: 567-579
Number of pages: 13
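
The idea stated in the abstract, tuning a configurable one-class naive Bayes classifier by checking how well its output clusters the target data, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the toy documents, the use of an average per-token log-likelihood as the document score, the decision threshold as the single configurable parameter, and a Fisher-style separation ratio standing in for the clustering quality scores the authors propose.

```python
import math
from collections import Counter


def train_one_class_nb(positive_docs, alpha=1.0):
    """Estimate Laplace-smoothed word log-probabilities from positive documents only."""
    counts = Counter(w for doc in positive_docs for w in doc.split())
    vocab_size = len(counts) + 1  # one extra slot reserved for unseen words
    denom = sum(counts.values()) + alpha * vocab_size
    log_prob = {w: math.log((c + alpha) / denom) for w, c in counts.items()}
    unseen = math.log(alpha / denom)
    return log_prob, unseen


def score(doc, model):
    """Average per-token log-likelihood of a document under the positive class."""
    log_prob, unseen = model
    tokens = doc.split()
    if not tokens:
        return unseen
    return sum(log_prob.get(w, unseen) for w in tokens) / len(tokens)


def separation(pos_scores, neg_scores):
    """Hypothetical quality score: squared gap between group means over summed variances."""
    def mean(xs):
        return sum(xs) / len(xs)

    def var(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    gap = mean(pos_scores) - mean(neg_scores)
    return gap ** 2 / (var(pos_scores) + var(neg_scores) + 1e-9)


def pick_threshold(target_docs, model):
    """Try each midpoint between sorted target scores; keep the best-separated split."""
    scores = sorted(score(d, model) for d in target_docs)
    best_t, best_q = None, -math.inf
    for lo, hi in zip(scores, scores[1:]):
        if lo == hi:
            continue  # no usable split between identical scores
        t = (lo + hi) / 2
        pos = [s for s in scores if s >= t]
        neg = [s for s in scores if s < t]
        q = separation(pos, neg)
        if q > best_q:
            best_t, best_q = t, q
    return best_t  # None if every target document has the same score


# Toy data (invented): labeled positives plus unlabeled target documents.
positive_docs = [
    "cheap flights to tokyo",
    "book cheap hotel tokyo",
    "tokyo flights and hotel deals",
]
target_docs = [
    "cheap tokyo hotel deals",
    "flights to tokyo",
    "quarterly earnings report released",
    "football match result tonight",
]

model = train_one_class_nb(positive_docs)
threshold = pick_threshold(target_docs, model)
predictions = [score(d, model) >= threshold for d in target_docs]
print(predictions)
```

No negative labels are used at any point: the threshold is chosen only from how cleanly it splits the unlabeled target documents into two groups, which is the general principle the abstract describes. A real implementation would swap in the quality scores evaluated in the paper.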