Content-based text classifiers for pornographic web filtering

Cited by: 9
Authors
Polpinij, Jantima [1]
Chotthanom, Anirut [1]
Sibunruang, Chumsak [1]
Chamchong, Rapeepom [1]
Puangpronpitag, Somnuk [1]
Affiliations
[1] Mahasarakham Univ, Fac Informat, Maha Sarakham 44150, Thailand
Keywords
pornographic web filtering; text classification; Naive Bayes; support vector machines
DOI
10.1109/ICSMC.2006.384926
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Due to the flood of pornographic web sites on the internet, effective web filtering systems are essential. Content-based web filtering has become one of the important techniques for handling and filtering inappropriate information on the web. We examine two machine learning algorithms (Support Vector Machines and Naive Bayes) for pornographic web filtering based on text content. We focus initially on Thai-language and English-language web sites. In this paper, we aim to investigate whether machine learning algorithms are suitable for web site classification. The empirical results show that the classifier based on Support Vector Machines is more effective for pornographic web filtering than the Naive Bayes classifier, especially with respect to the over-blocking problem.
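To make the abstract's terms concrete, the following is a minimal sketch of one of the two algorithm families it names: a multinomial Naive Bayes text classifier with add-one smoothing, together with an over-blocking measure. It is not the authors' implementation. The toy documents, the labels "block" and "allow", the function names, and the reading of over-blocking as the fraction of legitimate pages that are wrongly blocked are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Collect per-class document and word counts from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def over_blocking_rate(model, labelled_docs, blocked="block", allowed="allow"):
    """Fraction of legitimate ("allow") documents that the model blocks."""
    legit = [tokens for tokens, label in labelled_docs if label == allowed]
    if not legit:
        return 0.0
    wrongly_blocked = sum(1 for tokens in legit if predict_nb(model, tokens) == blocked)
    return wrongly_blocked / len(legit)


# Hypothetical toy data, invented for this sketch (not from the paper).
train_docs = [
    (["explicit", "adult", "content"], "block"),
    (["adult", "explicit", "video"], "block"),
    (["health", "education", "article"], "allow"),
    (["news", "sports", "article"], "allow"),
]
test_docs = [
    (["adult", "video", "education"], "allow"),  # legitimate page with sensitive words
    (["sports", "news"], "allow"),
    (["explicit", "video"], "block"),
]

model = train_nb(train_docs)
rate = over_blocking_rate(model, test_docs)
```

On this toy data the legitimate page that contains sensitive vocabulary is blocked, which is the kind of error the over-blocking measure is meant to capture. A comparison in the spirit of the abstract would train a Support Vector Machine on the same features and compare the two rates.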
Pages: 1481 / +
Page count: 2