Content-based text classifiers for pornographic web filtering

Cited by: 9
Authors
Polpinij, Jantima [1]
Chotthanom, Anirut [1]
Sibunruang, Chumsak [1]
Chamchong, Rapeepom [1]
Puangpronpitag, Somnuk [1]
Affiliation
[1] Mahasarakham Univ, Fac Informat, Maha Sarakham 44150, Thailand
Source
2006 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-6, PROCEEDINGS | 2006
Keywords
pornographic web filtering; text classification; Naive Bayes; support vector machines
DOI
10.1109/ICSMC.2006.384926
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Due to the flood of pornographic web sites on the internet, effective web filtering systems are essential. Content-based web filtering has become one of the important techniques for handling and filtering inappropriate information on the web. We examine two machine learning algorithms (Support Vector Machines and Naive Bayes) for pornographic web filtering based on text content, focusing initially on Thai-language and English-language web sites. In this paper, we aim to investigate whether machine learning algorithms are suitable for web site classification. The empirical results show that the classifier based on Support Vector Machines is more effective for pornographic web filtering than the Naive Bayes classifier, especially with respect to the over-blocking problem.
Pages: 1481 / +
Page count: 2
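
To make the abstract's terms concrete, here is a minimal sketch of a content-based text filter and of how over-blocking might be measured. It is not the authors' implementation: the record gives no details of their features, tokenisation, or data, so the toy documents, the "block"/"allow" labels, and the function names train_naive_bayes, predict, and over_blocking_rate are all hypothetical. Only a multinomial Naive Bayes classifier with add-one smoothing is shown; the Support Vector Machine side of the comparison is omitted because it would need an optimisation library.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (tokens, label). Not data from the paper.
TRAIN = [
    ("explicit adult video".split(), "block"),
    ("adult explicit gallery".split(), "block"),
    ("health education article".split(), "allow"),
    ("adult education course".split(), "allow"),
]

def train_naive_bayes(docs):
    """Collect the counts needed by a multinomial Naive Bayes model."""
    class_counts = Counter()            # documents per class
    word_counts = defaultdict(Counter)  # token counts per class
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(model, tokens):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def over_blocking_rate(model, labelled_docs):
    """Fraction of legitimate ('allow') documents that get blocked."""
    allowed = [tokens for tokens, label in labelled_docs if label == "allow"]
    if not allowed:
        return 0.0
    wrongly_blocked = sum(predict(model, t) == "block" for t in allowed)
    return wrongly_blocked / len(allowed)

model = train_naive_bayes(TRAIN)

# Hypothetical held-out pages, including legitimate ones with risky words.
TEST = [
    ("adult education course".split(), "allow"),
    ("adult video review".split(), "allow"),
    ("explicit video".split(), "block"),
]
print(predict(model, "explicit gallery".split()))
print(over_blocking_rate(model, TEST))
```

In this sketch, over-blocking is read as the false-positive rate on legitimate pages: a page about an ordinary topic is blocked because it shares vocabulary with the blocked class. That reading is an assumption, since the abstract names the problem without defining the metric.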