Integrating Noun-Based Feature Ranking and Selection Methods with Arabic Text Associative Classification Approach

Cited by: 3
Authors
Ghareb, Abdullah S. [1 ]
Hamdan, Abdul Razak [1 ]
Abu Bakar, Azuraliza [1 ]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Ctr Artificial Intelligence Technol, Bangi 43600, Selangor, Malaysia
Keywords
Noun extraction; Feature ranking; Feature selection; Associative classification; Arabic text; Category association rule;
DOI
10.1007/s13369-014-1304-3
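Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract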
Feature ranking and selection (FR&S) is an important preprocessing phase for text classification; in most cases it produces a small, valuable feature subset from the whole feature space and reduces classification errors. Because the associative classification (AC) approach is efficient and its training and testing depend on how features are ranked and selected, examining feature ranking methods is significant. This paper presents a method that integrates Arabic noun extraction with four FR&S methods: term frequency-inverse document frequency (TF-IDF), document frequency, odds ratio, and class discriminating measure (CDM). Association rule mining uses the result of the integrated feature selection to construct an Arabic text associative classifier. In this study, AC uses the majority voting and ordered decision list prediction methods to assign a test document to its category. A set of experiments is conducted on a collection of Arabic text documents, and the experimental results show that our AC method works better with extracted nouns combined with a feature selection method than with the feature selection method alone. The AC based on the CDM and TF-IDF methods outperforms the other methods in terms of AC accuracy. As the results indicate, the proposed method produces satisfactory classification accuracy and has a good selection effect on the Arabic text associative classifier.
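The two prediction methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule set, the confidence values, the English placeholder terms, and the function names are all hypothetical, and ranking rules by confidence alone is an assumption (the paper may order rules differently).

```python
from collections import Counter

# Hypothetical category association rules: (antecedent terms, category, confidence).
# Values are invented for illustration only.
RULES = [
    (frozenset({"match", "goal"}), "sports", 0.95),
    (frozenset({"market"}), "economy", 0.90),
    (frozenset({"bank", "market"}), "economy", 0.85),
    (frozenset({"goal"}), "sports", 0.80),
    (frozenset({"goal", "market"}), "economy", 0.60),
]


def matching_rules(doc_terms, rules):
    """Return the rules whose antecedent is fully contained in the document."""
    terms = set(doc_terms)
    return [rule for rule in rules if rule[0] <= terms]


def predict_ordered_list(doc_terms, rules, default=None):
    """Ordered decision list: the highest-confidence matching rule decides."""
    ranked = sorted(matching_rules(doc_terms, rules),
                    key=lambda rule: rule[2], reverse=True)
    return ranked[0][1] if ranked else default


def predict_majority_vote(doc_terms, rules, default=None):
    """Majority voting: each matching rule casts one vote for its category."""
    votes = Counter(category for _, category, _ in matching_rules(doc_terms, rules))
    return votes.most_common(1)[0][0] if votes else default


doc = ["goal", "match", "market", "bank"]
ordered_prediction = predict_ordered_list(doc, RULES)
majority_prediction = predict_majority_vote(doc, RULES)
```

For this toy document the two methods can disagree: the single strongest matching rule points to one category, while the larger number of weaker matching rules points to another, which is why the choice of prediction method matters for an associative classifier.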
Pages: 7807 - 7822
Page count: 16
Related papers
50 in total
  • [21] Feature Selection by Using Heuristic Methods for Text Classification
    Sel, Ilhami
    Yeroglu, Celalettin
    Hanbay, Davut
    2019 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND DATA PROCESSING (IDAP 2019), 2019,
  • [22] Comparison of feature selection methods in Kurdish text classification
    Ari M. Saeed
    Soran Badawi
    Sara A. Ahmed
    Diyari A. Hassan
    Iran Journal of Computer Science, 2024, 7 (1) : 55 - 64
  • [23] Filter feature selection methods for text classification: a review
    Hong Ming
    Wang Heyong
    Multimedia Tools and Applications, 2024, 83 : 2053 - 2091
  • [24] Filter feature selection methods for text classification: a review
    Ming, Hong
    Heyong, Wang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (1) : 2053 - 2091
  • [25] An Experimental Study of Feature Selection Methods for Text Classification
    Uchyigit, Gulden
    Clark, Keith
    PERSONALIZATION TECHNIQUES AND RECOMMENDER SYSTEMS, 2008, : 303 - 320
  • [26] Integrating associative rule-based classification with Naive Bayes for text classification
    Hadi, Wa'el
    Al-Radaideh, Qasem A.
    Alhawari, Samer
    APPLIED SOFT COMPUTING, 2018, 69 : 344 - 356
  • [27] PSO-Based Feature Selection for Arabic Text Summarization
    Al-Zahrani, Ahmed M.
    Mathkour, Hassan
    Abdalla, Hassan
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2015, 21 (11) : 1454 - 1469
  • [28] Effective Text Classification by a Supervised Feature Selection Approach
    Basu, Tanmay
    Murthy, C. A.
    12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012), 2012, : 918 - 925
  • [29] Weirdness Coefficient as a Feature Selection Method for Arabic Special Domain Text Classification
    Al-Thubaity, AbdulMohsen
    Alanazi, Albandari
    Hazzaa, Itisam
    Al-Tuwaijri, Haya
    2012 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2012), 2012, : 69 - 72
  • [30] Feature selection using an improved Chi-square for Arabic text classification
    Bahassine, Said
    Madani, Abdellah
    Al-Sarem, Mohammed
    Kissi, Mohamed
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2020, 32 (02) : 225 - 231