Application of an Improved CHI Feature Selection Algorithm

Cited by: 9
Authors
Cai, Liang-jing [1]
Lv, Shu [1]
Shi, Kai-bo [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Math Sci, Chengdu 611731, Sichuan, Peoples R China
[2] Chengdu Univ, Sch Elect Informat & Elect Engn, Chengdu 610106, Sichuan, Peoples R China
DOI
10.1155/2021/9963382
Chinese Library Classification (CLC) number
O1 [Mathematics]
Subject classification code
0701; 070101
Abstract
Text classification is a core task in machine learning and is widely applied in information filtering, sentiment analysis, and text review. Improving classification accuracy has been a main research goal in this field in recent years. Feature selection plays an important role in text classification: it eliminates irrelevant features, reduces dimensionality, and improves classification accuracy. This paper studies the CHI feature selection algorithm, and its main work and contributions are as follows. First, the paper analyzes the flaws of the CHI algorithm, identifies the introduction of new parameters as the direction for improvement, and proposes a new algorithm based on variance and the coefficient of variation. Second, experiments are designed to verify the effectiveness of the new algorithm. They cover two languages (Chinese and English text classification systems), two classifiers (KNN and Naive Bayes), and two data distributions (balanced and unbalanced datasets). Finally, three comparative experiments are conducted and the results of each are analyzed. In all three, the results improve significantly over those obtained before the improvement.
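For orientation, the classical CHI (chi-square) score that this line of work starts from can be sketched as follows. This is a minimal illustration of the standard term-category statistic only, not the authors' improved algorithm: the abstract does not give the variance or coefficient-of-variation terms, so they are not reproduced here. The function names `chi_square` and `chi_scores` and the toy corpus are hypothetical.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Classical chi-square score from a 2x2 term/category table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category without the term
    d: documents outside the category without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or category never varies
    return n * (a * d - b * c) ** 2 / denom


def chi_scores(docs, labels, category):
    """Score every term against one category using document frequencies."""
    n_in = sum(1 for y in labels if y == category)
    n_out = len(labels) - n_in
    df_in, df_out = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # set(tokens): count each term once per document
        (df_in if y == category else df_out).update(set(tokens))
    scores = {}
    for term in set(df_in) | set(df_out):
        a, b = df_in[term], df_out[term]
        scores[term] = chi_square(a, b, n_in - a, n_out - b)
    return scores


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "team"],
]
labels = ["sports", "sports", "finance", "finance"]

scores = chi_scores(docs, labels, "sports")
for term in sorted(scores, key=scores.get, reverse=True):
    print(f"{term:8s} {scores[term]:.3f}")
```

In this toy corpus, "market" receives the same score for the "sports" category as "goal", because the squared term discards the sign of the association. That is one commonly noted limitation of the classical statistic; the abstract does not say which flaws the authors target.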
Pages: 8