Integrating Noun-Based Feature Ranking and Selection Methods with Arabic Text Associative Classification Approach

Authors
Abdullah S. Ghareb
Abdul Razak Hamdan
Azuraliza Abu Bakar
Affiliation
[1] Universiti Kebangsaan Malaysia, Center for Artificial Intelligence Technology, Faculty of Information Science and Technology
Keywords
Noun extraction; Feature ranking; Feature selection; Associative classification; Arabic text; Category association rule;
DOI
Not available
Abstract
Feature ranking and selection (FR&S) is an important preprocessing phase for text classification: in most cases it produces a small, valuable feature subspace from the whole feature space and reduces classification errors. Because the associative classification (AC) approach is efficient and its training and testing depend on how features are ranked and selected, examining feature ranking methods is important. This paper presents a method that integrates Arabic noun extraction with four FR&S methods: term frequency–inverse document frequency (TF-IDF), document frequency, odds ratio, and class discriminating measure (CDM). The association rule technique uses the output of the integrated feature selection to construct an Arabic text associative classifier. In this study, AC uses the majority voting and ordered decision list prediction methods to assign a test document to its category. A set of experiments is conducted on a collection of Arabic text documents, and the experimental results show that the AC method works better with extracted nouns combined with a feature selection method than with a feature selection method alone. AC based on the CDM and TF-IDF methods outperforms the other methods in terms of accuracy. The results indicate that the proposed method produces satisfactory classification accuracy and selects features effectively for the Arabic text associative classifier.
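To make the ranking step concrete, the following is a minimal sketch of three of the measures named in the abstract (document frequency, odds ratio, and CDM), applied to a toy set of already-extracted nouns. It is an illustration under stated assumptions, not the authors' implementation: the formulas are the commonly used textbook forms and may differ from those in the paper, the English terms stand in for Arabic nouns, and the corpus, the smoothing constant `eps`, and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy corpus (hypothetical): each document is a set of extracted nouns plus a label.
docs = [
    ({"match", "team", "goal"}, "sport"),
    ({"team", "player", "goal"}, "sport"),
    ({"market", "bank", "price"}, "economy"),
    ({"bank", "price", "team"}, "economy"),
]

def document_frequency(docs):
    """Number of documents in which each term occurs."""
    df = Counter()
    for terms, _ in docs:
        df.update(terms)
    return df

def class_probs(docs, term, category, eps=0.5):
    """Smoothed P(term | category) and P(term | other categories).

    The additive smoothing (eps) is an assumption made here to avoid
    division by zero and log(0); it is not taken from the paper.
    """
    in_cat = [terms for terms, label in docs if label == category]
    out_cat = [terms for terms, label in docs if label != category]
    p = (sum(term in t for t in in_cat) + eps) / (len(in_cat) + 2 * eps)
    q = (sum(term in t for t in out_cat) + eps) / (len(out_cat) + 2 * eps)
    return p, q

def odds_ratio(docs, term, category):
    """Log odds ratio: positive when the term favours the category."""
    p, q = class_probs(docs, term, category)
    return math.log((p * (1 - q)) / ((1 - p) * q))

def cdm(docs, term, category):
    """Class discriminating measure, taken here as |log(P(t|c) / P(t|not c))|."""
    p, q = class_probs(docs, term, category)
    return abs(math.log(p / q))

def top_k(docs, category, score, k=3):
    """Rank the vocabulary by a score function and keep the k best terms."""
    vocab = set().union(*(terms for terms, _ in docs))
    return sorted(vocab, key=lambda t: (-score(docs, t, category), t))[:k]

if __name__ == "__main__":
    print(document_frequency(docs).most_common(3))
    print(top_k(docs, "sport", odds_ratio))
    print(top_k(docs, "sport", cdm))
```

In this sketch the odds ratio is signed, so terms typical of other categories rank low, whereas the absolute value in CDM scores a term highly whenever it separates the category from the rest in either direction. The selected terms would then serve as the items from which category association rules are mined.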
Pages: 7807-7822
Page count: 15