Combining Naive Bayes and Tri-gram Language Model for Spam Filtering

Cited by: 0

Authors:

Ma, Xi [1]

Shen, Yao [1]

Chen, Junbo [2]

Xue, Guirong [2]

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China

[2] Alibaba Cloud Comp, Hangzhou 310012, Peoples R China

Source:

KNOWLEDGE ENGINEERING AND MANAGEMENT | 2011 / Vol. 123

Keywords:

Naive Bayes; tri-gram; email anti-spam; machine learning; statistical approach;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The increasing volume of unsolicited bulk email (also known as spam) causes serious damage to email service providers and inconvenience to individual users. Among the approaches to stopping spam, the Naive Bayes filter is very popular. In this paper, we propose combining the standard Naive Bayes model with a tri-gram language model, namely the TGNB model, to filter spam emails. The TGNB model addresses the strong independence assumption of the standard Naive Bayes model. Our experimental results on three public datasets indicate that the TGNB model achieves higher spam recall and a lower false positive rate, and even outperforms the support vector machine method, which is state-of-the-art, on all three datasets.
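
As a rough illustration of the idea in the abstract, the sketch below scores a message under a separate tri-gram language model for each class and combines that score with the class prior through Bayes' rule, so each word is conditioned on the two words before it instead of being treated as independent. The abstract does not give the TGNB formulation, so everything here is an assumption made for illustration: the class name `TrigramNB`, the add-alpha smoothing, the `<s>` and `</s>` padding tokens, and the toy messages. The paper's actual model may differ.

```python
import math
from collections import Counter


class TrigramNB:
    """Bayes classifier whose class likelihood is a tri-gram language model.

    Hypothetical sketch, not the TGNB model from the paper.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha      # add-alpha smoothing constant (assumed)
        self.tri = {}           # label -> Counter of (w1, w2, w3) counts
        self.bi = {}            # label -> Counter of (w1, w2) context counts
        self.docs = Counter()   # label -> number of training messages
        self.vocab = set()

    @staticmethod
    def _pad(tokens):
        # Two start markers so the first word also has a two-word context.
        return ["<s>", "<s>"] + list(tokens) + ["</s>"]

    def fit(self, documents, labels):
        for tokens, label in zip(documents, labels):
            self.docs[label] += 1
            tri = self.tri.setdefault(label, Counter())
            bi = self.bi.setdefault(label, Counter())
            padded = self._pad(tokens)
            self.vocab.update(padded)
            for i in range(2, len(padded)):
                tri[tuple(padded[i - 2:i + 1])] += 1
                bi[tuple(padded[i - 2:i])] += 1
        return self

    def log_score(self, tokens, label):
        # log P(label) + sum_i log P(w_i | w_{i-2}, w_{i-1}, label)
        score = math.log(self.docs[label] / sum(self.docs.values()))
        vocab_size = len(self.vocab)
        padded = self._pad(tokens)
        for i in range(2, len(padded)):
            num = self.tri[label][tuple(padded[i - 2:i + 1])] + self.alpha
            den = self.bi[label][tuple(padded[i - 2:i])] + self.alpha * vocab_size
            score += math.log(num / den)
        return score

    def predict(self, tokens):
        return max(self.docs, key=lambda label: self.log_score(tokens, label))


# Toy training data (invented for this example).
train_docs = [
    "win free money now".split(),
    "free money win big".split(),
    "meeting at noon today".split(),
    "see you at noon".split(),
]
train_labels = ["spam", "spam", "ham", "ham"]

model = TrigramNB().fit(train_docs, train_labels)
print(model.predict("free money now".split()))
print(model.predict("see you at noon today".split()))
```

With `alpha` set very large the tri-gram counts stop mattering and the classifier falls back to the class prior; a practical system would more likely use back-off or interpolation with bi-gram and uni-gram estimates, which this sketch omits.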


Pages: 509 / +

Page count: 3
