Text Classification Based on Naive Bayes Algorithm with Feature Selection

Cited by: 0
Authors
Chen, Zhenguo [1]
Shi, Guang [1]
Wang, Xiaoju [1]
Affiliations
[1] N China Inst Sci & Technol, Dept Comp Sci & Technol, Beijing 101601, Peoples R China
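Funding
National Natural Science Foundation of China
Keywords
Text classification; Naive Bayes; Feature selection
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract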
Text classification is the task of assigning documents to predefined classes, and it has become one of the key techniques for organizing information. Machine learning, a branch of artificial intelligence, has been applied to text classification with better performance than rule-based methods. However, most machine learning methods need large numbers of training samples, which not only makes data collection laborious but also requires more storage and computing resources during processing. Naive Bayes is one of the most efficient and effective inductive learning algorithms and can produce more accurate results on large training sets. To improve performance, feature selection mechanisms are incorporated into the naive Bayes algorithm. First, feature extraction techniques are applied to remove irrelevant and redundant features. Then, the naive Bayes classification algorithm is used for text classification. The experimental results show that this method maintains high classification accuracy.
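The two-stage pipeline described in the abstract (reduce the feature set, then classify with naive Bayes) can be illustrated with a minimal sketch. The abstract does not say which feature selection criterion the authors use, so the chi-square ranking below is an assumption, as are the multinomial model with Laplace smoothing, the toy documents, and all function names (`chi_square_scores`, `select_features`, `train_naive_bayes`, `predict`). It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def chi_square_scores(docs, labels):
    """Score each term by its largest chi-square statistic over the classes.

    Assumption: chi-square is a stand-in criterion; the paper's own
    selection method is not given in the abstract.
    """
    n = len(docs)
    class_count = Counter(labels)
    term_class = defaultdict(Counter)  # term -> class -> docs containing term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_class[term][label] += 1
    scores = {}
    for term, per_class in term_class.items():
        df = sum(per_class.values())
        best = 0.0
        for cls, n_cls in class_count.items():
            n11 = per_class[cls]      # term present, document in class
            n10 = df - n11            # term present, document in another class
            n01 = n_cls - n11         # term absent, document in class
            n00 = n - n_cls - n10     # term absent, document in another class
            denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
            if denom:
                best = max(best, n * (n11 * n00 - n10 * n01) ** 2 / denom)
        scores[term] = best
    return scores


def select_features(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def train_naive_bayes(docs, labels, vocab):
    """Multinomial naive Bayes with Laplace smoothing over the selected vocab."""
    class_count = Counter(labels)
    term_count = {cls: Counter() for cls in class_count}
    for tokens, label in zip(docs, labels):
        term_count[label].update(t for t in tokens if t in vocab)
    model = {}
    for cls, n_cls in class_count.items():
        total = sum(term_count[cls].values())
        log_prior = math.log(n_cls / len(docs))
        log_like = {
            t: math.log((term_count[cls][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[cls] = (log_prior, log_like)
    return model


def predict(model, tokens):
    """Return the class with the highest log posterior; unselected terms are ignored."""
    def score(cls):
        log_prior, log_like = model[cls]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(sorted(model), key=score)


# Hypothetical toy corpus, for illustration only.
train_docs = [
    "the team won the match".split(),
    "the striker scored a goal in the match".split(),
    "the bank raised the interest rate".split(),
    "the market fell as the rate rose".split(),
]
train_labels = ["sports", "sports", "finance", "finance"]

vocab = select_features(train_docs, train_labels, k=2)
model = train_naive_bayes(train_docs, train_labels, vocab)
prediction = predict(model, "a late goal decided the match".split())
```

In this sketch a term such as "the", which occurs in every document, receives a score of zero and is dropped before training, so the classifier only stores and evaluates likelihoods for the selected terms.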
Pages: 4255 - 4260
Number of pages: 6
Related papers
50 in total
  • [31] Parallel naive Bayes algorithm for large-scale Chinese text classification based on spark
    Liu Peng
    Zhao Hui-han
    Teng Jia-yu
    Yang Yan-yan
    Liu Ya-feng
    Zhu Zong-wei
    JOURNAL OF CENTRAL SOUTH UNIVERSITY, 2019, 26 (01) : 1 - 12
  • [32] Classification Algorithm for Naive Bayes Based on Validity and Correlation
    Dong, Huailin
    Zhu, Xiaodan
    Wu, Qingfeng
    Huang, Juanjuan
    SENSORS, MEASUREMENT AND INTELLIGENT MATERIALS, PTS 1-4, 2013, 303-306 : 1609 - 1612
  • [33] Text-Based Gender Classification of Twitter Data using Naive Bayes and SVM Algorithm
    Angeles, Angelic
    Quintos, Maria Nikki
    Octaviano, Manolito, Jr.
    Raga, Rodolofo, Jr.
    2021 IEEE REGION 10 CONFERENCE (TENCON 2021), 2021, : 522 - 526
  • [34] Improved feature size customized fast correlation-based filter for Naive Bayes text classification
    Zhang, Yun
    Zhang, Yude
    He, Wei
    Yu, Shujuan
    Zhao, Shengmei
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (03) : 3117 - 3127
  • [35] Variable selection for Naive Bayes classification
    Blanquero, Rafael
    Carrizosa, Emilio
    Ramirez-Cobo, Pepa
    Remedios Sillero-Denamiel, M.
    COMPUTERS & OPERATIONS RESEARCH, 2021, 135
  • [36] Naïve bayes text classification with statistical data feature selection
    Janaki Meena, M.
    Chandran, K.R.
    Advances in Modelling and Analysis B, 2009, 52 (1-2) : 83 - 99
  • [37] Text Sentiment Analysis Based on Improved Naive Bayes Algorithm
    Li, Xinfei
    Xie, Xiaolan
    Wang, Jiaming
    Tang, Yigang
    ARTIFICIAL INTELLIGENCE AND SECURITY, ICAIS 2022, PT I, 2022, 13338 : 513 - 523
  • [38] The Research Of Feature Selection Of Text Classification Based On Integrated Learning Algorithm
    Xia Huosong
    Liu Jian
    2011 TENTH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS TO BUSINESS, ENGINEERING AND SCIENCE (DCABES), 2011, : 20 - 22
  • [39] Feature selection algorithm for text classification based on improved mutual information
    丛帅
    张积宾
    徐志明
    王宇颖
    Journal of Harbin Institute of Technology(New series), 2011, (03) : 144 - 148
  • [40] Category Discrimination Based Feature Selection Algorithm in Chinese Text Classification
    Yi, Junkai
    Yang, Guang
    Wan, Jing
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2016, 32 (05) : 1145 - 1159