Text Classification Based on Naive Bayes Algorithm with Feature Selection

Cited by: 0
Authors
Chen, Zhenguo [1]
Shi, Guang [1]
Wang, Xiaoju [1]
Affiliations
[1] N China Inst Sci & Technol, Dept Comp Sci & Technol, Beijing 101601, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text classification; Naive Bayes; Feature selection
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Text classification is the task of assigning documents to predefined classes. It has become one of the key techniques for organizing information. Machine learning, a branch of artificial intelligence, has been applied to text classification and performs better than rule-based methods. However, most machine learning methods need a large number of training samples, which not only makes data collection laborious but also requires more storage and computing resources during processing. Naive Bayes is one of the most efficient and effective inductive learning algorithms and can give more accurate results on large training sets. To improve performance, feature selection mechanisms are incorporated into the naive Bayes algorithm. First, feature extraction techniques are applied to remove irrelevant and redundant features. Then the naive Bayes classification algorithm is used to classify the text. The experimental results show that this method maintains high classification accuracy.
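The two-stage pipeline described in the abstract (select features, then classify with naive Bayes) can be illustrated with a minimal sketch. The abstract does not name the selection criterion or the naive Bayes variant, so the chi-square score, the multinomial model with Laplace smoothing, the toy documents, and all function names below are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter, defaultdict


def chi2_scores(docs, labels):
    """Score each term by its largest chi-square statistic over all classes."""
    n = len(docs)
    class_count = Counter(labels)
    # doc_freq[term][label] = number of documents with that label containing term
    doc_freq = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            doc_freq[term][label] += 1
    scores = {}
    for term, per_class in doc_freq.items():
        total = sum(per_class.values())
        best = 0.0
        for label, n_class in class_count.items():
            a = per_class[label]   # in class, contains term
            b = total - a          # other classes, contains term
            c = n_class - a        # in class, lacks term
            d = n - n_class - b    # other classes, lacks term
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_features(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi2_scores(docs, labels)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def train_nb(docs, labels, vocab):
    """Multinomial naive Bayes with Laplace smoothing over the selected vocab."""
    class_count = Counter(labels)
    term_count = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_count[label].update(t for t in tokens if t in vocab)
    model = {}
    for label, n_class in class_count.items():
        total = sum(term_count[label].values())
        log_prior = math.log(n_class / len(docs))
        log_like = {
            t: math.log((term_count[label][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def predict(model, tokens):
    """Return the class with the highest log posterior; unselected terms are ignored."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(sorted(model), key=score)


# Toy data, invented for this sketch only.
docs = [
    "the match ended with a late goal".split(),
    "the team won the match".split(),
    "the market fell as stock prices dropped".split(),
    "the stock market rallied".split(),
]
labels = ["sport", "sport", "finance", "finance"]

vocab = select_features(docs, labels, k=3)
model = train_nb(docs, labels, vocab)
print(sorted(vocab))
print(predict(model, "the team scored in the match".split()))
```

On this toy set, a term such as "the", which appears in every document, receives a score of zero and is dropped, so the classifier is trained only on the few terms that discriminate between classes. That is the sense in which selection removes irrelevant features before classification.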
Pages: 4255-4260
Number of pages: 6
Related papers
50 in total
  • [21] HYBRID FEATURE SELECTION APPROACH USING BACTERIAL FORAGING ALGORITHM GUIDED BY NAIVE BAYES CLASSIFICATION
    Mittal, Divya
    Bala, Manju
    2017 8TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT), 2017,
  • [22] Naive Feature Selection: Sparsity in Naive Bayes
    Askari, Armin
    d'Aspremont, Alex
    El Ghaoui, Laurent
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 1813 - 1821
  • [23] Research on text classification mining based on Naive Bayes
    Liu, LZ
    Zhang, CL
    Chen, JJ
    ISTM/2005: 6TH INTERNATIONAL SYMPOSIUM ON TEST AND MEASUREMENT, VOLS 1-9, CONFERENCE PROCEEDINGS, 2005, : 8521 - 8524
  • [24] Research on Archives Text Classification Based on Naive Bayes
    Liu, Peixin
    Yu, Hongzhi
    Xu, Tao
    Lan, Chuanqo
    PROCEEDINGS OF 2017 IEEE 2ND INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2017, : 187 - 190
  • [25] Weighted naive Bayes text classification algorithm based on improved distance correlation coefficient
    Ruan, Shufen
    Chen, Baozhou
    Song, Kunfang
    Li, Hongwei
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (04): : 2729 - 2738
  • [26] Firefly Algorithm based Feature Selection for Arabic Text Classification
    Marie-Sainte, Souad Larabi
    Alalyani, Nada
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2020, 32 (03) : 320 - 328
  • [27] An Improvement to Naive Bayes for Text Classification
    Zhang, Wei
    Gao, Feng
    CEIS 2011, 2011, 15
  • [28] Fast Feature Selection for Naive Bayes Classification in Data Stream Mining
    Lutu, Patricia E. N.
    WORLD CONGRESS ON ENGINEERING - WCE 2013, VOL III, 2013, : 1549 - 1554
  • [29] Text Classification on Mahout with Naive-Bayes Machine Learning Algorithm
    Salur, Mehmet Umut
    Tokat, Sezai
    Aydilek, Ibrahim Berkan
    2017 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP), 2017,
  • [30] An Improved Naive Bayes Text Classification Algorithm In Chinese Information Processing
    Yuan, Lingling
    THIRD INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND COMPUTATIONAL TECHNOLOGY (ISCSCT 2010), 2010, : 267 - 269