Text Classification Based on Naive Bayes Algorithm with Feature Selection

Cited by: 0
Authors
Chen, Zhenguo [1]
Shi, Guang [1]
Wang, Xiaoju [1]
Affiliation
[1] N China Inst Sci & Technol, Dept Comp Sci & Technol, Beijing 101601, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text classification; Naive Bayes; Feature selection
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Text classification is the task of assigning documents to predefined classes, and it has become one of the key techniques for organizing information. Machine learning, a branch of artificial intelligence, has been applied to text classification and performs better than rule-based approaches. However, most machine learning methods need a large number of training samples, which not only makes data collection laborious but also requires more storage and computing resources during processing. Naive Bayes is one of the most efficient and effective inductive learning algorithms and can produce more accurate results on large training sets. To improve performance, feature selection mechanisms are incorporated into the Naive Bayes algorithm. First, feature extraction techniques are applied to remove irrelevant and redundant features. Then the Naive Bayes classification algorithm is used to classify the text. The experimental results show that this method maintains high classification accuracy.
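The two-stage pipeline described in the abstract (first discard uninformative terms, then classify with Naive Bayes) can be illustrated with a minimal sketch. The abstract does not say which selection criterion or which Naive Bayes variant the authors used, so the chi-square ranking, the multinomial model with Laplace smoothing, the function names, and the toy documents below are all assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter, defaultdict

def chi2_scores(docs, labels):
    """Chi-square score per term (document-level presence), maximised over classes."""
    n = len(docs)
    class_count = Counter(labels)
    term_docs = Counter()               # number of documents containing the term
    term_class = defaultdict(Counter)   # same count, split by class
    for tokens, y in zip(docs, labels):
        for t in set(tokens):
            term_docs[t] += 1
            term_class[t][y] += 1
    scores = {}
    for t, df in term_docs.items():
        best = 0.0
        for c, nc in class_count.items():
            a = term_class[t][c]        # term present, class c
            b = df - a                  # term present, other classes
            c_ = nc - a                 # term absent, class c
            d = n - df - c_             # term absent, other classes
            denom = (a + b) * (c_ + d) * (a + c_) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c_) ** 2 / denom)
        scores[t] = best
    return scores

def select_features(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi2_scores(docs, labels)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])

def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with Laplace smoothing, restricted to vocab."""
    class_count = Counter(labels)
    term_count = {c: Counter() for c in class_count}
    for tokens, y in zip(docs, labels):
        term_count[y].update(t for t in tokens if t in vocab)
    model = {}
    for c in class_count:
        total = sum(term_count[c].values())
        log_prior = math.log(class_count[c] / len(docs))
        log_like = {t: math.log((term_count[c][t] + 1) / (total + len(vocab)))
                    for t in vocab}
        model[c] = (log_prior, log_like)
    return model

def predict(model, tokens):
    """Return the class with the highest log posterior; unselected terms are ignored."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(sorted(model), key=score)

# Toy corpus (hypothetical data, for illustration only).
docs = [
    "cheap pills buy now".split(),
    "buy cheap watches now".split(),
    "meeting agenda for monday".split(),
    "project meeting notes for team".split(),
]
labels = ["spam", "spam", "ham", "ham"]

vocab = select_features(docs, labels, k=4)
model = train_nb(docs, labels, vocab)
print(sorted(vocab))
print(predict(model, "buy cheap pills".split()))
print(predict(model, "team meeting for friday".split()))
```

In this sketch the selection step shrinks the vocabulary before any likelihoods are estimated, so the classifier stores and evaluates fewer parameters. That is the storage and computation saving the abstract alludes to, though the actual gains reported in the paper depend on its own data and criterion.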
Pages: 4255-4260
Page count: 6