A Comparative Approach to Email Classification Using Naive Bayes Classifier and Hidden Markov Model

Cited by: 0
Authors
Gomes, Sebastian Romv [1]
Saroar, Sk Golam [1]
Telot, Md Mosfaiul Alam [1]
Khan, Behroz Newaz [1]
Chakrabarty, Amitabha [1]
Mostakim, Moin [1]
Affiliations
[1] BRAC Univ, Dept Comp Sci & Engn, 66 Bir Uttam AK Khandakar Rd, Dhaka 1212, Bangladesh
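Keywords
Email Classification; Hidden Markov Model; Naive Bayes; Natural Language Processing; NLTK; Supervised Learning
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification code
0808; 0809
Abstract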
This research compares two approaches to classifying emails by category. Two machine learning algorithms, Naive Bayes and the Hidden Markov Model (HMM), are used to detect whether an email is important or spam. The Naive Bayes classifier is based on conditional probabilities; it is fast, works well with small datasets, and treats individual words as independent features. An HMM is a generative probabilistic model that provides a distribution over sequences of observations. HMMs can handle inputs of variable length and help a program reach the most likely decision based on both previous decisions and current data. Various combinations of NLP techniques (stopword removal, stemming, and lemmatization) were tried with both algorithms to inspect the differences in accuracy and to find the best method among them.
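
As an illustration of the Naive Bayes side of the comparison, the following is a minimal sketch of a word-based classifier with stopword removal. It is not the authors' implementation: the stopword list, the toy training emails, the class and function names, and the use of add-one (Laplace) smoothing are all assumptions made for this example, and the paper itself reports using NLTK rather than hand-written code.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, tiny stopword list for illustration only
# (the paper mentions NLTK; this sketch avoids external dependencies).
STOPWORDS = {"the", "a", "an", "is", "to", "and", "of", "for", "at", "you", "your"}


def tokenize(text, remove_stopwords=True):
    """Lowercase, split on whitespace, strip punctuation, optionally drop stopwords."""
    words = [w.strip(".,!?;:\"'()") for w in text.lower().split()]
    return [w for w in words if w and not (remove_stopwords and w in STOPWORDS)]


class NaiveBayes:
    """Multinomial Naive Bayes over words, treating each word as an independent feature."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of emails per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log P(class) + sum of log P(word | class), with add-one smoothing
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in tokenize(text):
                if w not in self.vocab:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training emails, purely for demonstration.
texts = [
    "Win a free prize now",
    "Free money, claim your prize",
    "Meeting agenda for Monday",
    "Project report due Monday",
]
labels = ["spam", "spam", "important", "important"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("claim your free prize"))
print(model.predict("agenda for the project meeting"))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied. Stemming or lemmatization, the other preprocessing steps named in the abstract, could be applied inside `tokenize` in the same way as the stopword filter.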
Pages: 482-487
Number of pages: 6