Hybrid Feature Selection for Amharic News Document Classification

Cited by: 4
Authors
Endalie, Demeke [1]
Haile, Getamesay [1]
Affiliations
[1] Jimma Inst Technol, Fac Comp & Informat, Jimma, Ethiopia
Keywords
Text processing - Feature Selection - Information retrieval systems;
DOI
10.1155/2021/5516262
Chinese Library Classification number
T [Industrial Technology];
Subject classification code
08 ;
Abstract
The number of Amharic digital documents has grown rapidly, which makes automatic text classification increasingly important. Feature selection plays a crucial role in both classification accuracy and computational time, and choosing the right features matters most when the initial feature set is very large. In this paper, we present a hybrid feature selection method, called IGCHIDF, which combines the information gain (IG), chi-square (CHI), and document frequency (DF) feature selection methods. We evaluate the proposed method on two datasets: dataset 1, containing 9 news categories, and dataset 2, containing 13 news categories. Our experimental results show that the proposed method performs better than the other methods on both datasets 1 and 2. On dataset 2, the classification accuracy of IGCHIDF is up to 3.96% higher than that of IG, up to 11.16% higher than that of CHI, and 7.3% higher than that of DF.
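The abstract names the three criteria that IGCHIDF combines but does not say how the scores are merged. The sketch below is only an illustration under stated assumptions: it scores each term by DF, IG, and CHI on a toy English corpus and then merges the three rankings by summing rank positions. The rank-sum rule, the function names `score_terms` and `hybrid_select`, and the example data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (base 2) of a count distribution."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def score_terms(docs, labels):
    """Return {term: (df, ig, chi)} for tokenized docs with class labels."""
    n = len(docs)
    classes = sorted(set(labels))
    class_count = Counter(labels)
    doc_sets = [set(d) for d in docs]
    vocab = sorted(set().union(*doc_sets))
    h_class = entropy(class_count.values(), n)

    scores = {}
    for term in vocab:
        # Per-class counts of documents that contain the term.
        with_t = Counter(l for s, l in zip(doc_sets, labels) if term in s)
        df = sum(with_t.values())
        without_t = [class_count[c] - with_t[c] for c in classes]

        # IG = H(C) - P(t) H(C | t) - P(not t) H(C | not t)
        ig = h_class
        if df:
            ig -= (df / n) * entropy(with_t.values(), df)
        if n - df:
            ig -= ((n - df) / n) * entropy(without_t, n - df)

        # Chi-square: maximum of the 2x2 statistic over all classes.
        chi = 0.0
        for cls in classes:
            a = with_t[cls]            # term present, in class
            b = df - a                 # term present, other classes
            c = class_count[cls] - a   # term absent, in class
            d = n - df - c             # term absent, other classes
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                chi = max(chi, n * (a * d - b * c) ** 2 / denom)

        scores[term] = (df, ig, chi)
    return scores


def hybrid_select(docs, labels, k):
    """Pick k terms by summed rank over DF, IG, and CHI (assumed rule)."""
    scores = score_terms(docs, labels)
    rank_sum = Counter()
    for i in range(3):
        # Higher score means better rank; ties are broken alphabetically.
        ordered = sorted(scores, key=lambda t: (-scores[t][i], t))
        for rank, term in enumerate(ordered):
            rank_sum[term] += rank
    return sorted(scores, key=lambda t: (rank_sum[t], t))[:k]


# Toy corpus (hypothetical data, not from the paper's datasets).
docs = [
    ["goal", "match", "team"],
    ["team", "win", "match"],
    ["vote", "party", "election"],
    ["party", "vote", "team"],
]
labels = ["sport", "sport", "politics", "politics"]

selected = hybrid_select(docs, labels, k=2)
print(selected)
```

Rank aggregation is used here because DF, IG, and CHI live on different scales, so their raw values cannot be added directly. Other merging rules, such as intersecting the top-k lists or weighting normalized scores, would be equally plausible readings of "hybrid".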
Pages: 8