Block Clustering for Web Pages Categorization

Authors
Charrad, Malika [1]
Lechevallier, Yves
ben Ahmed, Mohamed
Saporta, Gilbert
Affiliations
[1] Natl Sch Comp Sci, Manouba 2010, Tunisia
Source
INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING, PROCEEDINGS | 2009 / Vol. 5788
Keywords
Web content mining; text mining; block clustering;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the growth of web-based applications and the increased popularity of the World Wide Web (WWW), the WWW has become the greatest source of information available in the world, leading to an increased difficulty of extracting relevant information. Moreover, the content of web sites is constantly changing, leading to continual changes in Web users' behaviours. Therefore, there is significant interest in analysing web content data to better serve users. Our proposed approach, which is grounded on automatic textual analysis of a web site independently from its usage, attempts to define groups of documents dealing with the same topic. Both document clustering and word clustering are well-studied problems. However, most existing algorithms cluster documents and words separately but not simultaneously. In this paper, we propose to apply a block clustering algorithm to categorize the pages of a web site according to their content. We report results of our recent testing of the CROKI2 algorithm on a tourist web site.
Pages: 260 / +
Page count: 3
Related papers
50 in total
  • [1] Research of Web Pages Categorization
    Zhongda Lin, Kun Deng, Yanfen Hong; Department of Computer Science and Technology, Nanchang University, Nanchang, China
    南昌工程学院学报 (Journal of Nanchang Institute of Technology), 2006, (02) : 107 - 111
  • [2] Research of web pages categorization
    Lin, Zhongda
    Deng, Kun
    Hong, Yanfen
    GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS, 2007, : 691 - 694
  • [3] Automatic categorization of web pages and user clustering with mixtures of hidden Markov models
    Ypma, A
    Heskes, T
    WEBKDD 2002 - MINING WEB DATA FOR DISCOVERING USAGE PATTERNS AND PROFILES, 2003, 2703 : 35 - 49
  • [4] Information categorization in web pages and sites
    Carchiolo, V. (car@diit.unict.it), 2005, IOS Press (03):
  • [5] A Study on Automatic Web Pages Categorization
    Sun Bo
    Sun Qiurui
    Chen Zhong
    Fu Zengmei
    2009 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE, VOLS 1-3, 2009, : 1423 - 1427
  • [6] Clustering Web Pages into Hierarchical Categories
    Yao, Zhongmei
    Choi, Ben
    INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2007, 3 (02) : 17 - 35
  • [7] Clustering Web pages based on their structure
    Crescenzi, V
    Merialdo, P
    Missier, P
    DATA & KNOWLEDGE ENGINEERING, 2005, 54 (03) : 279 - 299
  • [8] Clustering Web pages into hierarchical categories
    Louisiana Tech University, Ruston, LA, United States
    Int. J. Intell. Inf. Technologies, 2007, 2 (17-35):
  • [9] A Review on Web Pages Clustering Techniques
    Patel, Dipak
    Zaveri, Mukesh
    TRENDS IN NETWORKS AND COMMUNICATIONS, 2011, 197 : 700 - 710
  • [10] Text categorization of multilingual web pages in specific domain
    Liu, Jicheng
    Liang, Chunyan
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2008, 5012 : 938 - 944