Hybrid Feature Selection for Amharic News Document Classification

Cited by: 4
Authors
Endalie, Demeke [1]
Haile, Getamesay [1]
Affiliations
[1] Jimma Inst Technol, Fac Comp & Informat, Jimma, Ethiopia
Keywords
Text processing; Feature selection; Information retrieval systems
DOI
10.1155/2021/5516262
Chinese Library Classification (CLC) code
T [Industrial Technology]
Discipline classification code
08
Abstract
The volume of Amharic digital documents has grown rapidly, which makes automatic text classification increasingly important. Feature selection plays a crucial role in both classification accuracy and computational time, particularly when the initial feature set is very large. In this paper, we present a hybrid feature selection method, called IGCHIDF, which combines the information gain (IG), chi-square (CHI), and document frequency (DF) feature selection methods. We evaluate the proposed method on two datasets: dataset 1, containing 9 news categories, and dataset 2, containing 13 news categories. Our experimental results show that the proposed method performs better than the other methods on both datasets 1 and 2. On dataset 2, the classification accuracy of IGCHIDF is up to 3.96% higher than that of IG, up to 11.16% higher than that of CHI, and 7.3% higher than that of DF.
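The abstract names the three component scores but does not say how IGCHIDF combines them. The sketch below is therefore only an illustration under stated assumptions: it computes DF, IG, and CHI per term using their standard textbook definitions, then ranks terms by the average of the three min-max-normalized scores. That averaging rule, the function names `term_scores` and `hybrid_select`, and the toy English tokens are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (base 2) of a class distribution given raw counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def term_scores(docs, labels):
    """Return {term: (df, ig, chi)} for tokenized docs and their class labels."""
    n = len(docs)
    classes = sorted(set(labels))
    class_count = Counter(labels)
    doc_sets = [set(d) for d in docs]
    vocab = set().union(*doc_sets)
    h_class = entropy(class_count.values(), n)

    scores = {}
    for t in vocab:
        # Per-class number of documents that contain the term.
        present = Counter(l for s, l in zip(doc_sets, labels) if t in s)
        df = sum(present.values())
        absent = {c: class_count[c] - present[c] for c in classes}

        # IG = H(C) - P(t) H(C | t) - P(not t) H(C | not t)
        ig = h_class
        if df:
            ig -= (df / n) * entropy(present.values(), df)
        if n - df:
            ig -= ((n - df) / n) * entropy(absent.values(), n - df)

        # CHI: maximum over classes of the 2x2 chi-square statistic.
        chi = 0.0
        for c in classes:
            a = present[c]        # contains t, in class c
            b = df - a            # contains t, not in class c
            e = absent[c]         # lacks t, in class c
            d = n - df - e        # lacks t, not in class c
            denom = (a + e) * (b + d) * (a + b) * (e + d)
            if denom:
                chi = max(chi, n * (a * d - b * e) ** 2 / denom)

        scores[t] = (df, ig, chi)
    return scores


def hybrid_select(docs, labels, k):
    """Top-k terms by the mean of min-max-normalized DF, IG, and CHI.

    ASSUMPTION: this combination rule is illustrative only; the abstract
    does not specify how IGCHIDF merges the three scores.
    """
    scores = term_scores(docs, labels)
    terms = sorted(scores)
    combined = dict.fromkeys(terms, 0.0)
    for i in range(3):
        column = [scores[t][i] for t in terms]
        lo, hi = min(column), max(column)
        span = hi - lo
        for t, v in zip(terms, column):
            combined[t] += ((v - lo) / span if span else 0.0) / 3
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(terms, key=lambda t: (-combined[t], t))[:k]


if __name__ == "__main__":
    # Toy corpus with placeholder English tokens (not Amharic data).
    docs = [
        ["goal", "match", "team"],
        ["team", "coach", "match"],
        ["vote", "party", "election"],
        ["election", "party", "team"],
    ]
    labels = ["sport", "sport", "politics", "politics"]
    print(hybrid_select(docs, labels, 3))
```

In this toy corpus, a term such as "team" has the highest document frequency but appears in both classes, so its IG and CHI scores are low and the averaged ranking places it below class-specific terms. This is one plausible motivation for mixing a frequency-based score with class-aware ones.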
Pages: 8