A Comparative Study on Feature Window Selection in Text Filtering

Citations: 0
Authors
Hu Quan [1]
Xie Fang [2]
Liu Xiaoguang [3]
Affiliations
[1] Huazhong Normal Univ, Coll Phys Sci & Technol, Wuhan 430079, Peoples R China
[2] Hubei Univ Technol, Coll Comp Sci, Wuhan 430068, Peoples R China
[3] Nankai Univ, Coll Informat Technol, Tianjin 300071, Peoples R China
Keywords
text filtering; feature vector; feature window; matching algorithm
DOI
10.1109/IFITA.2009.189
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text representation is a preliminary step in text filtering, and the vector space model (VSM) is the most commonly used method in this field. However, the document feature set produced by VSM usually has very high dimensionality, and as a result the distribution of feature values tends to be highly skewed. This paper presents some new mechanisms to mitigate these problems. With these mechanisms, document features are extracted from smaller feature windows, such as sentences, graphs and blocks, rather than from the full text, and the relevant texts are finally evaluated by local similarity. The feature windows are obtained by analyzing the linguistic structure of the documents. As a result, the approach can have a notable effect on the precision of text filtering.
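The abstract does not spell out the matching algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: feature windows are taken to be sentences found by a naive punctuation split, features are raw term frequencies, similarity is cosine, and the local similarity of a document is assumed to be the best score over its windows. The function names (`tokenize`, `sentence_windows`, `cosine`, `full_text_similarity`, `local_similarity`) are hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sentence_windows(text):
    """Split a document into sentence-sized windows (naive splitter, an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def full_text_similarity(profile, document):
    """Baseline: compare the profile with one vector built from the whole text."""
    return cosine(Counter(tokenize(profile)), Counter(tokenize(document)))


def local_similarity(profile, document):
    """Compare the profile with each window; keep the best score (assumed: max)."""
    profile_vec = Counter(tokenize(profile))
    scores = [
        cosine(profile_vec, Counter(tokenize(window)))
        for window in sentence_windows(document)
    ]
    return max(scores, default=0.0)


if __name__ == "__main__":
    profile = "text filtering"
    document = (
        "Text filtering is useful. "
        "The weather was cold and the train was late again today."
    )
    print(full_text_similarity(profile, document))
    print(local_similarity(profile, document))
```

In this toy input the relevant sentence is diluted by unrelated text when the whole document is scored as one vector, while the window-level score is driven only by the matching sentence. Taking the maximum is one possible aggregation; an average or a top-k mean over windows would be equally plausible readings of "local similarity".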
Pages: 209 / +
Page count: 2
Related papers
50 in total
  • [41] A comparative study on the effect of feature selection on classification accuracy
    Karabulut, Esra Mahsereci
    Ozel, Selma Ayse
    Ibrikci, Turgay
    FIRST WORLD CONFERENCE ON INNOVATION AND COMPUTER SCIENCES (INSODE 2011), 2012, 1 : 323 - 327
  • [42] Comparative study of feature selection methods on microarray data
    Miyamoto, T
    Uchimura, S
    Hamamoto, Y
    Iizuka, N
    Oka, M
    Yamada-Okabe, H
    IEEE EMBS APBME 2003, 2003, : 82 - 83
  • [43] A Comparative Study of Feature Selection Methods for Biomarker Discovery
    Mungloo-Dilmohamud, Zahra
    Marigliano, Gary
    Jaufeerally-Fakim, Yasmina
    Pena-Reyes, Carlos
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 2789 - 2791
  • [44] A Comparative Study of Feature Selection Techniques for Intrusion Detection
    Kaur, Rajveer
    Kumar, Gulshan
    Kumar, Krishan
    2015 2ND INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT (INDIACOM), 2015, : 2120 - 2124
  • [45] Feature Selection in Mobile Activity Recognition: A Comparative Study
    Loddo, Andrea
    Pes, Barbara
    Riboni, Daniele
    2021 22ND IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2021), 2021, : 181 - 186
  • [46] Feature Selection in Software Defect Prediction: A Comparative Study
    Kakkar, Misha
    Jain, Sarika
    2016 6TH INTERNATIONAL CONFERENCE - CLOUD SYSTEM AND BIG DATA ENGINEERING (CONFLUENCE), 2016, : 658 - 663
  • [47] Contextual feature selection for text classification
    Paradis, Francois
    Nie, Jian-Yun
    INFORMATION PROCESSING & MANAGEMENT, 2007, 43 (02) : 344 - 352
  • [48] A Comparative Study of Feature Selection Methods on Genomic Datasets
    Anaraki, Javad Rahimipour
    Usefi, Hamid
    2019 IEEE 32ND INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS), 2019, : 471 - 476
  • [49] A comparative study on feature selection methods for drug discovery
    Liu, Y
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2004, 44 (05) : 1823 - 1828
  • [50] Feature selection method on imbalanced text
    Liao, Yi-Xing
    Pan, Xue-Zeng
    Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China, 2012, 41 (04) : 592 - 595