Two feature weighting approaches for naive Bayes text classifiers

Cited by: 79
Authors
Zhang, Lungan [1]
Jiang, Liangxiao [1, 2]
Li, Chaoqun [3]
Kong, Ganggang [1]
Affiliations
[1] China Univ Geosci, Dept Comp Sci, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Hubei Key Lab Intelligent Geoinformat Proc, Wuhan 430074, Peoples R China
[3] China Univ Geosci, Dept Math, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Naive Bayes text classifiers; Feature weighting; Gain ratio; Decision tree;
DOI
10.1016/j.knosys.2016.02.017
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies feature weighting approaches for naive Bayes text classifiers. Almost all existing feature weighting approaches for naive Bayes text classifiers have one of two defects: they either yield limited improvement in classification performance or sacrifice the simplicity and execution time of the final models. In fact, feature weighting is not new to the machine learning community, and many researchers have made fruitful efforts in this field. This paper reviews some simple and efficient feature weighting approaches designed for standard naive Bayes classifiers and adapts them for naive Bayes text classifiers. As a result, this paper proposes two adaptive feature weighting approaches for naive Bayes text classifiers. Experimental results on benchmark and real-world data show that, compared to their competitors, the proposed feature weighting approaches achieve higher classification accuracy while maintaining the simplicity and low execution time of the final models. © 2016 Elsevier B.V. All rights reserved.
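The general idea of a feature-weighted naive Bayes text classifier can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the weights are gain ratios computed on word presence/absence (gain ratio appears in the keywords) and rescaled so they average 1, the base classifier is multinomial naive Bayes with Laplace smoothing, and each word's log-likelihood term is multiplied by its weight. The function names (`gain_ratio_weights`, `train`, `predict`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (base 2) of a list of non-negative counts."""
    total = sum(counts)
    return -sum((n / total) * math.log2(n / total) for n in counts if n > 0)


def gain_ratio_weights(docs, labels, vocab):
    """Gain ratio of each word's presence/absence, rescaled to average 1.

    Assumption: binarising each word and rescaling are illustrative choices,
    not necessarily what the paper does.
    """
    n = len(docs)
    base = entropy(list(Counter(labels).values()))
    ratios = {}
    for w in vocab:
        present = [y for d, y in zip(docs, labels) if w in d]
        absent = [y for d, y in zip(docs, labels) if w not in d]
        # Class entropy remaining after splitting on the word.
        cond = sum(len(part) / n * entropy(list(Counter(part).values()))
                   for part in (present, absent) if part)
        split = entropy([len(present), len(absent)])
        ratios[w] = (base - cond) / split if split > 0 else 0.0
    total = sum(ratios.values())
    if total == 0:
        return {w: 1.0 for w in vocab}  # fall back to unweighted naive Bayes
    return {w: r * len(vocab) / total for w, r in ratios.items()}


def train(docs, labels):
    """Fit Laplace-smoothed multinomial naive Bayes plus per-word weights."""
    vocab = sorted({w for d in docs for w in d})
    classes = sorted(set(labels))
    prior = {c: math.log((labels.count(c) + 1) / (len(docs) + len(classes)))
             for c in classes}
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(d)
    loglik = {}
    for c in classes:
        denom = sum(counts[c].values()) + len(vocab)
        loglik[c] = {w: math.log((counts[c][w] + 1) / denom) for w in vocab}
    return prior, loglik, gain_ratio_weights(docs, labels, vocab)


def predict(model, doc):
    """Return argmax over c of log P(c) + sum_w weight_w * freq_w * log P(w|c)."""
    prior, loglik, weights = model
    freq = Counter(w for w in doc if w in weights)  # ignore unseen words
    scores = {c: prior[c] + sum(weights[w] * f * loglik[c][w]
                                for w, f in freq.items())
              for c in prior}
    return max(scores, key=scores.get)


# Toy data, invented purely for illustration.
docs = [["goal", "match", "team"], ["team", "win", "goal"],
        ["stock", "market", "price"], ["market", "trade", "stock"]]
labels = ["sport", "sport", "finance", "finance"]
model = train(docs, labels)
print(predict(model, ["goal", "team"]))
```

In this sketch the weights are computed once from the training data, and prediction adds only one multiplication per word, which is one way a weighting scheme could keep the final model as simple as plain naive Bayes.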
Pages: 137-144
Number of pages: 8
Related papers
50 results in total
  • [21] Text Classification Based on Naive Bayes Algorithm with Feature Selection
    Chen, Zhenguo
    Shi, Guang
    Wang, Xiaoju
    INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, 2012, 15 (10): 4255-4260
  • [22] Naive bayes text categorization using improved feature selection
    Lin, Kunhui
    Kang, Kai
    Huang, Yunping
    Zhou, Changle
    Wang, Beizhan
    Journal of Computational Information Systems, 2007, 3 (03): 1159-1164
  • [23] Toward Optimal Feature Selection in Naive Bayes for Text Categorization
    Tang, Bo
    Kay, Steven
    He, Haibo
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (09): 2508-2521
  • [24] Feature subset selection using naive Bayes for text classification
    Feng, Guozhong
    Guo, Jianhua
    Jing, Bing-Yi
    Sun, Tieli
    PATTERN RECOGNITION LETTERS, 2015, 65: 109-115
  • [25] Incremental augmented naive Bayes classifiers
    Alcobé, JR
    ECAI 2004: 16TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 110: 539-543
  • [26] Evolving extended naive bayes classifiers
    Klawonn, Frank
    Angelov, Plamen
    ICDM 2006: Sixth IEEE International Conference on Data Mining, Workshops, 2006: 643-647
  • [27] Divergence-Based Feature Selection for Naive Bayes Text Classification
    Wang, Huizhen
    Zhu, Jingbo
    Su, Keh-Yih
    IEEE NLP-KE 2008: PROCEEDINGS OF INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, 2008: 209-+
  • [28] Compensation strategy of unseen feature words in naive Bayes text classification
    School of Management, Harbin Institute of Technology, Harbin 150001, China
    Not available
    Harbin Gongye Daxue Xuebao, 2008, 6 (956-960):
  • [29] A Visual Tool for Bayesian Data Analysis: The Impact of Smoothing on Naive Bayes Text Classifiers
    Di Nunzio, Giorgio Maria
    Sordoni, Alessandro
    SIGIR 2012: PROCEEDINGS OF THE 35TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2012: 1002
  • [30] Naive Bayes Text Categorization Algorithm Based on TF-IDF Attribute Weighting
    Jiang, Feng
    Zhang, Zhenghao
    Chen, Ping
    Liu, Yongrui
    PROCEEDINGS OF 2018 THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE (CSAI 2018) / 2018 THE 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND MULTIMEDIA TECHNOLOGY (ICIMT 2018), 2018: 521-525