Improved Mutual Information Method For Text Feature Selection

Cited by: 0
Authors
Ding Xiaoming [1]
Tang Yan [1]
Affiliations
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Keywords
text classification; feature selection; mutual information
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Reducing the dimensionality of a high-dimensional feature set is one of the difficulties of text categorization. Feature selection has been applied effectively in text classification because of its low computational complexity. Research shows that mutual information is a good feature selection method, but it does not consider the term frequency in each category of the corpus or the connections between terms. To remedy these defects of the traditional mutual information method, this article improves the mutual information measure by introducing the frequency of a feature within a class and the dispersion of a feature within a class. An experimental platform was built by constructing a Chinese text classification system, and multiple sets of experiments were run on this system. The results show that the new feature selection approach performs better in text categorization.
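For orientation, the traditional mutual information score that the abstract takes as its baseline is commonly written as MI(t, c) = log(P(t, c) / (P(t) P(c))), estimated from document counts. The sketch below illustrates that baseline only. The abstract does not give the formula for the improved measure (in-class feature frequency and in-class dispersion), so it is not reproduced here. The function name `mutual_information` and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(docs, labels):
    """Score each (term, category) pair with the classic MI estimate.

    docs   : list of token lists, one per document
    labels : list of category names, aligned with docs
    Only pairs that co-occur at least once are scored; presence is
    counted per document, so in-document term frequency is ignored.
    """
    n = len(docs)
    cat_count = Counter(labels)   # documents per category
    term_count = Counter()        # documents containing the term
    joint_count = Counter()       # documents in category containing the term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_count[term] += 1
            joint_count[(term, label)] += 1

    scores = {}
    for (term, cat), a in joint_count.items():
        # log( P(t, c) / (P(t) * P(c)) ) with probabilities as count / n
        scores[(term, cat)] = math.log(a * n / (term_count[term] * cat_count[cat]))
    return scores


# Toy corpus (hypothetical data, for illustration only)
docs = [["goal", "match"], ["goal", "team"], ["stock", "market"], ["stock", "goal"]]
labels = ["sport", "sport", "finance", "finance"]
scores = mutual_information(docs, labels)
```

In this toy corpus, "match" appears in a single sport document yet receives the same score for that category as "stock" does for finance, which appears in every finance document. This insensitivity to how often a term occurs within a class is the kind of defect the abstract attributes to the traditional method.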
Pages: 163-166
Number of pages: 4
Related papers
50 in total
  • [11] Spam Feature Selection Based on the Improved Mutual Information Algorithm
    Liang Ting
    Yu Qingsong
    2012 FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION NETWORKING AND SECURITY (MINES 2012), 2012, : 67 - 70
  • [12] Weighted average pointwise mutual information for feature selection in text categorization
    Schneider, KM
    KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2005, 2005, 3721 : 252 - 263
  • [13] Study on mutual information-based feature selection for text categorization
    Xu, Yan
    Jones, Gareth
    Li, Jintao
    Wang, Bin
    Sun, Chunming
    Journal of Computational Information Systems, 2007, 3 (03): : 1007 - 1012
  • [14] An improved feature transformation method using mutual information
    Bassir, Seyed
    Akbari, Ahmad
    Nassersharif, Babak
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2014, 17 (02) : 107 - 115
  • [15] An Improved Text Feature Selection Method for Transfer Learning
    Liu, Jiang
    Wang, Hao
    Liu, Jun
    CONTEMPORARY RESEARCH ON E-BUSINESS TECHNOLOGY AND STRATEGY, 2012, 332 : 600 - +
  • [16] An improved text feature selection method for transfer learning
    Liu, Jiang
    Wang, Hao
    Liu, Jun
    Communications in Computer and Information Science, 2013, 332 : 600 - 611
  • [17] Text Feature Selection Method in battlefield information service
    Wang Kai
    Liu Jingzhi
    Wang Kai
    Gan Zhichun
    Cai Yanjun
    2016 17TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES (PDCAT), 2016, : 216 - 220
  • [18] Modified Pointwise Mutual Information-Based Feature Selection for Text Classification
    Georgieva-Trifonova, Tsvetanka
    PROCEEDINGS OF THE FUTURE TECHNOLOGIES CONFERENCE (FTC) 2021, VOL 2, 2022, 359 : 333 - 353
  • [19] A novel feature selection method based on normalized mutual information
    La The Vinh
    Sungyoung Lee
    Young-Tack Park
    Brian J. d’Auriol
    Applied Intelligence, 2012, 37 : 100 - 120
  • [20] An Improved Feature Selection Algorithm with Conditional Mutual Information for Classification Problems
    Palanichamy, Jaganathan
    Ramasamy, Kuppuchamy
    2013 INTERNATIONAL CONFERENCE ON HUMAN COMPUTER INTERACTIONS (ICHCI), 2013,