Adapting Naive Bayes Model for Text Classification with One-of and Imbalanced Multi-Class Problems

Cited by: 0
Authors
Almaleh, Ahood [1]
Aslam, Muhammad Ahtisham [1]
Saeedi, Kawther [1]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah 21589, Saudi Arabia
Keywords
text classification; multi-class problems; text mining; machine learning
DOI
10.22937/IJCSNS.2020.20.09.11
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text classification, an area of growing interest in the research community, assigns a text or part of a text to classes in order to extract useful information. Manual classification is expensive to scale and becomes increasingly unreliable as the number of documents grows, especially when there are more than two classes (multiclass classification). Automatic classification uses algorithms to categorize text documents into classes, making classification more reliable and efficient. This paper describes a process for using the Naive Bayes classifier for one-of, multiclass text classification, especially in cases where imbalanced classes are more likely. The proposed process consists of several steps: data preprocessing, building the classification model, evaluating it, and predicting classes as the final classification results.
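The steps named in the abstract (preprocessing, model building, evaluation, prediction) can be illustrated with a minimal sketch of a standard multinomial Naive Bayes classifier. This is not the authors' adapted model, which the abstract does not specify. The tokenizer, the Laplace smoothing value, the toy documents, the class names, and the names `NaiveBayesSketch` and `per_class_recall` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Preprocessing (assumed): lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesSketch:
    """Multinomial Naive Bayes for one-of (single-label) multiclass text."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = set().union(*self.word_counts.values())
        docs_per_class = Counter(labels)
        # Priors come from class frequencies, so they reflect any imbalance.
        self.log_prior = {
            c: math.log(docs_per_class[c] / len(labels)) for c in self.classes
        }
        self.total = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for t in tokens:
                num = self.word_counts[c][t] + self.alpha
                score += math.log(num / (self.total[c] + self.alpha * size))
            scores[c] = score
        # One-of classification: exactly one class is returned per document.
        return max(scores, key=scores.get)


def per_class_recall(y_true, y_pred):
    """Evaluation: recall per class, so minority classes stay visible."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Toy, deliberately imbalanced training set (hypothetical data).
train_docs = [
    "the team won the match",
    "a great goal in the final match",
    "the coach praised the team",
    "players scored a late goal",
    "new phone released with faster chip",
    "the software update fixes bugs",
    "doctors recommend daily exercise and sleep",
]
train_labels = ["sport", "sport", "sport", "sport", "tech", "tech", "health"]

model = NaiveBayesSketch().fit(train_docs, train_labels)
predictions = [model.predict(d) for d in train_docs]
recall = per_class_recall(train_labels, predictions)
```

Reporting recall per class rather than a single accuracy figure is one common choice when classes are imbalanced, because overall accuracy can look high even when a small class is never predicted.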
Pages: 84-90
Page count: 7
Related papers
50 in total
  • [41] Parameter-free classification in multi-class imbalanced data sets
    Cerf, Loic
    Gay, Dominique
    Selmaoui-Folcher, Nazha
    Cremilleux, Bruno
    Boulicaut, Jean-Francois
    DATA & KNOWLEDGE ENGINEERING, 2013, 87 : 109 - 129
  • [42] Multi-class Imbalanced Data Oversampling for Vertebral Column Pathologies Classification
    Saez, Jose A.
    Quintian, Hector
    Krawczyk, Bartosz
    Wozniak, Michal
    Corchado, Emilio
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS (HAIS 2018), 2018, 10870 : 131 - 142
  • [43] Performance Analysis of Binarization Strategies for Multi-class Imbalanced Data Classification
    Zak, Michal
    Wozniak, Michal
    COMPUTATIONAL SCIENCE - ICCS 2020, PT IV, 2020, 12140 : 141 - 155
  • [44] An online ensemble classification algorithm for multi-class imbalanced data stream
    Han, Meng
    Li, Chunpeng
    Meng, Fanxing
    He, Feifei
    Zhang, Ruihua
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (11) : 6845 - 6880
  • [45] An Effective Recursive Technique for Multi-Class Classification and Regression for Imbalanced Data
    Alam, Tahira
    Ahmed, Chowdhury Farhan
    Zahin, Sabit Anwar
    Khan, Muhammad Asif Hossain
    Islam, Maliha Tashfia
    IEEE ACCESS, 2019, 7 : 127615 - 127630
  • [46] A New Multi-Class WSVM Classification to Imbalanced Human Activity Dataset
    Abidine, M'hamed B.
    Fergani, Belkacem
    JOURNAL OF COMPUTERS, 2014, 9 (07) : 1560 - 1565
  • [47] An Effective Ensemble Method for Multi-class Classification and Regression for Imbalanced Data
    Alam, Tahira
    Ahmed, Chowdhury Farhan
    Zahin, Sabit Anwar
    Khan, Muhammad Asif Hossain
    Islam, Maliha Tashfia
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS (ICDM 2018), 2018, 10933 : 59 - 74
  • [48] What makes multi-class imbalanced problems difficult? An experimental study
    Lango, Mateusz
    Stefanowski, Jerzy
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 199
  • [49] AMDO: An Over-Sampling Technique for Multi-Class Imbalanced Problems
    Yang, Xuebing
    Kuang, Qiuming
    Zhang, Wensheng
    Zhang, Guoping
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2018, 30 (09) : 1672 - 1685
  • [50] Entropy-based Sampling Approaches for Multi-Class Imbalanced Problems
    Li, Lusi
    He, Haibo
    Li, Jie
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (11) : 2159 - 2170