Block Clustering for Web Pages Categorization

Cited by: 0
Authors
Charrad, Malika [1]
Lechevallier, Yves
ben Ahmed, Mohamed
Saporta, Gilbert
Affiliations
[1] Natl Sch Comp Sci, Manouba 2010, Tunisia
Source
INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING, PROCEEDINGS | 2009 / Vol. 5788
Keywords
Web content mining; text mining; block clustering
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the growth of web-based applications and the increased popularity of the World Wide Web (WWW), the WWW became the greatest source of information available in the world, leading to an increased difficulty of extracting relevant information. Moreover, the content of web sites is constantly changing, leading to continual changes in Web users' behaviours. Therefore, there is significant interest in analysing web content data to better serve users. Our proposed approach, which is grounded on automatic textual analysis of a web site independently from the usage, attempts to define groups of documents dealing with the same topic. Both document clustering and word clustering are well studied problems. However, most existing algorithms cluster documents and words separately but not simultaneously. In this paper, we propose to apply a block clustering algorithm to categorize a web site's pages according to their content. We report results of our recent testing of the CROKI2 algorithm on a tourist web site.
Pages: 260 / +
Number of pages: 3
Related papers
50 items in total
  • [31] Method of clustering web pages based on granular computing
    Hu, Jun
    Guan, Chun
    Liu, Bocheng
    Hu, J., 2013, Asian Network for Scientific Information (13) : 2107 - 2110
  • [32] Web pages classification using domain ontology and clustering
    Soltani, Sima
    Barforoush, Ahmad Abdollahzadeh
    CIS: 2007 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY, PROCEEDINGS, 2007, : 242 - +
  • [33] AN INVESTIGATION OF CLUSTERING ALGORITHMS IN THE IDENTIFICATION OF SIMILAR WEB PAGES
    De Lucia, Andrea
    Risi, Michele
    Scanniello, Giuseppe
    Tortora, Genoveffa
    JOURNAL OF WEB ENGINEERING, 2009, 8 (04): : 346 - 370
  • [34] Clustering Web Pages Based on Structure and Style Similarity
    Gowda, Thamme
    Mattmann, Chris
    PROCEEDINGS OF 2016 IEEE 17TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IEEE IRI), 2016, : 175 - 180
  • [35] WEB PAGES CLASSIFICATION USING DOMAIN ONTOLOGY AND CLUSTERING
    Soltani, Sima
    Barforoush, Ahmad Abdollahzadeh
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2009, 23 (01) : 17 - 29
  • [36] Querying and clustering web pages about persons and organizations
    Ye, SR
    Chua, TS
    Kei, JR
    IEEE/WIC INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, PROCEEDINGS, 2003, : 344 - 350
  • [37] Partitioning-based clustering for Web document categorization
    Boley, D
    Gini, M
    Gross, R
    Han, EH
    Hastings, K
    Karypis, G
    Kumar, V
    Mobasher, B
    Moore, J
    DECISION SUPPORT SYSTEMS, 1999, 27 (03) : 329 - 341
  • [38] BlockWeb: an IR Model for Block Structured Web Pages
    Bruno, Emmanuel
    Faessel, Nicolas
    Le Maitre, Jacques
    Scholl, Michel
    CBMI: 2009 INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING, 2009, : 219 - +
  • [39] An efficient scheme for automatic web pages categorization using the support vector machine
    Bhalla, Vinod Kumar
    Kumar, Neeraj
    NEW REVIEW OF HYPERMEDIA AND MULTIMEDIA, 2016, 22 (03) : 223 - 242
  • [40] Fuzzy clustering method for Web user based on pages classification
    Zhan, Li-Qiang
    Liu, Da-Xin
    Wuhan University Journal of Natural Sciences, 2004, 9 (05) : 553 - 556