A Comparative Study on Feature Window Selection in Text Filtering

Citations: 0
Authors
Hu Quan [1 ]
Xie Fang [2 ]
Liu Xiaoguang [3 ]
Affiliations
[1] Huazhong Normal Univ, Coll Phys Sci & Technol, Wuhan 430079, Peoples R China
[2] Hubei Univ Technol, Coll Comp Sci, Wuhan 430068, Peoples R China
[3] Nankai Univ, Coll Informat Technol, Tianjin 300071, Peoples R China
Keywords
text filtering; feature vector; feature window; matching algorithm
DOI
10.1109/IFITA.2009.189
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Text representation is a preliminary step in text filtering, and the vector space model (VSM) is the most commonly used method in this field. However, the document feature set produced by VSM usually has very high dimensionality, and as a result the distribution of feature values tends to be highly skewed. This paper presents some new mechanisms to mitigate these problems. With these mechanisms, document features are extracted from smaller feature windows, such as sentences, graphs and blocks, rather than from the full text, and the relevant texts are finally evaluated by local similarity. The windows are obtained by analyzing the linguistic structure of the documents. As a result, the approach can have a remarkable effect on the precision of text filtering.
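The idea of scoring a document by local rather than full-text similarity can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes sentence-level windows, raw term-frequency vectors, cosine similarity, and taking the best-matching window as the local score. The function names (`tokenize`, `split_sentences`, `cosine`, `global_similarity`, `local_similarity`) and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; a stand-in for real preprocessing (assumption).
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    # Naive sentence windows: split after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def global_similarity(document, profile):
    # Baseline: one VSM vector built from the full text.
    return cosine(Counter(tokenize(document)), Counter(tokenize(profile)))


def local_similarity(document, profile):
    # One vector per sentence window; keep the best-matching window.
    # Using max as the aggregate is an assumption of this sketch.
    profile_vec = Counter(tokenize(profile))
    windows = split_sentences(document)
    return max(
        (cosine(Counter(tokenize(w)), profile_vec) for w in windows),
        default=0.0,
    )


doc = (
    "The weather was mild all week. "
    "Spam filtering relies on text features. "
    "We went hiking on Sunday."
)
profile = "text filtering features"

print("global:", round(global_similarity(doc, profile), 3))
print("local: ", round(local_similarity(doc, profile), 3))
```

In this toy input only one sentence matches the filtering profile, so the window-level score is higher than the full-text score, which is diluted by the unrelated sentences. Other window types (such as blocks) or other aggregates (mean, top-k) could be substituted.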
Pages: 209 / +
Page count: 2
Related papers
50 in total
  • [31] Feature Selection in Text Classification
    Sahin, Durmus Ozkan
    Ates, Nurullah
    Kilic, Erdal
    2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU), 2016, : 1777 - 1780
  • [32] Filter methods for feature selection: A comparative study
    Sanchez-Marono, Noelia
    Alonso-Betanzos, Amparo
    Tombilla-Sanroman, Maria
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2007, 2007, 4881 : 178 - 187
  • [33] An Empirical Study of Category Skew on Feature Selection for Text Categorization
    Simeon, Mondelle
    Hilderman, Robert
    ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2009, 5549 : 249 - +
  • [34] Study and Analyze on Feature Selection in Text Categorization for Engineering Domain
    Wu Junyun
    EMERGING MATERIALS AND MECHANICS APPLICATIONS, 2012, 487 : 383 - 386
  • [35] An extensive empirical study of feature selection metrics for text classification
    Forman, George
    Journal of Machine Learning Research, 2003, 3 : 1289 - 1305
  • [36] Impact of feature selection techniques in Text Classification: An Experimental study
    Basha, S. Rahamat
    Rani, J. Keziya
    Yadav, J. J. C. Prasad
    Kumar, G. Ravi
    JOURNAL OF MECHANICS OF CONTINUA AND MATHEMATICAL SCIENCES, 2019, : 39 - 51
  • [37] Arabic Text Classification: A Review Study on Feature Selection Methods
    Hijazi, Musab Mustafa
    Zeki, Akram
    Ismail, Amelia
    2021 22ND INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2021, : 554 - 559
  • [38] A Comparative Study of Redundant Feature Detection based Feature Selection Methods
    Zeng, Xue-Qiang
    Chen, Qian-Sheng
    2014 INTERNATIONAL CONFERENCE ON COMPUTER, INFORMATION AND TELECOMMUNICATION SYSTEMS (CITS), 2014,
  • [39] Feature Representations for Scene Text Character Recognition: A Comparative Study
    Yi, Chucai
    Yang, Xiaodong
    Tian, Yingli
    2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2013, : 907 - 911
  • [40] A Review on Feature Selection and Feature Extraction for Text Classification
    Shah, Foram P.
    Patel, Vibha
    PROCEEDINGS OF THE 2016 IEEE INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET), 2016, : 2264 - 2268