A Comparative Study on Feature Window Selection in Text Filtering

Authors
Hu Quan [1]
Xie Fang [2]
Liu Xiaoguang [3]
Affiliations
[1] Huazhong Normal Univ, Coll Phys Sci & Technol, Wuhan 430079, Peoples R China
[2] Hubei Univ Technol, Coll Comp Sci, Wuhan 430068, Peoples R China
[3] Nankai Univ, Coll Informat Technol, Tianjin 300071, Peoples R China
Keywords
text filtering; feature vector; feature window; matching algorithm
DOI
10.1109/IFITA.2009.189
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Text representation is a preliminary step in text filtering, and the vector space model (VSM) is the most commonly used method in this field. However, the document feature set produced by VSM usually has very high dimensionality, and as a result the distribution of feature values tends to be highly skewed. This paper presents some new mechanisms to mitigate these problems. With these mechanisms, document features are extracted from smaller feature windows, such as sentences, graphs and blocks, rather than from the full text, and the relevant texts are finally evaluated by local similarity. The feature windows are obtained by analyzing the linguistic structure of the documents. As a result, the approach can markedly improve the precision of text filtering.
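The idea of scoring a document by local similarity over feature windows can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the window boundaries, the term weighting, or the matching algorithm, so the sketch assumes sentence windows split on end punctuation, raw term-frequency vectors, cosine similarity, and the best-matching window as the document score. All function names (`tokenize`, `sentence_windows`, `cosine`, `local_similarity`, `global_similarity`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sentence_windows(text):
    """Split text into sentence windows (assumption: split after . ! or ?)."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def local_similarity(profile, document):
    """Score a document by its best-matching sentence window.

    Assumption: the maximum over windows stands in for the
    'local similarity' mentioned in the abstract.
    """
    profile_vec = Counter(tokenize(profile))
    scores = (
        cosine(profile_vec, Counter(tokenize(window)))
        for window in sentence_windows(document)
    )
    return max(scores, default=0.0)


def global_similarity(profile, document):
    """Baseline: cosine similarity against the full-text vector."""
    return cosine(Counter(tokenize(profile)), Counter(tokenize(document)))


if __name__ == "__main__":
    profile = "text filtering"
    document = (
        "Text filtering removes unwanted documents. "
        "The weather was sunny and the market was busy all day long."
    )
    print("local :", local_similarity(profile, document))
    print("global:", global_similarity(profile, document))
```

In this toy input the relevant sentence is diluted by unrelated text in the full-document vector, so the window-level score comes out higher than the full-text score. Whether that translates into better filtering precision depends on the windowing and matching choices, which the abstract leaves open.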
Pages: 209 / +
Page count: 2