Block Clustering for Web Pages Categorization

Cited by: 0
Authors
Charrad, Malika [1]
Lechevallier, Yves
ben Ahmed, Mohamed
Saporta, Gilbert
Affiliations
[1] Natl Sch Comp Sci, Manouba 2010, Tunisia
Source
INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING, PROCEEDINGS | 2009, Vol. 5788
Keywords
Web content mining; text mining; block clustering
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the growth of web-based applications and the increased popularity of the World Wide Web (WWW), the WWW has become the greatest source of information available in the world, which has made extracting relevant information increasingly difficult. Moreover, the content of web sites is constantly changing, leading to continual changes in Web users' behaviours. Therefore, there is significant interest in analysing web content data to better serve users. Our proposed approach, which is grounded on automatic textual analysis of a web site independently of its usage, attempts to define groups of documents dealing with the same topic. Both document clustering and word clustering are well-studied problems. However, most existing algorithms cluster documents and words separately but not simultaneously. In this paper, we propose to apply a block clustering algorithm to categorize the pages of a web site according to their content. We report results of our recent testing of the CROKI2 algorithm on a tourist web site.
Pages: 260 / +
Page count: 3
Related papers
50 items in total
  • [21] A Hierarchical Algorithm for Clustering Extremist Web Pages
    Qi, Xingqin
    Christensen, Kyle
    Duval, Robert
    Fuller, Edgar
    Spahiu, Arian
    Wu, Qin
    Zhang, Cun-Quan
    2010 INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM 2010), 2010, : 458 - 463
  • [22] Clustering web pages about persons and organizations
    Ye, Shiren
    Chua, Tat-Seng
    Kei, Jeremy R.
    Web Intelligence and Agent Systems, 2005, 3 (04): : 203 - 216
  • [23] Automatic Web Pages Categorization with ReliefF and Hidden Naive Bayes
    Jin, Xin
    Li, Rongyan
    Shen, Xian
    Bie, Rongfang
    APPLIED COMPUTING 2007, VOL 1 AND 2, 2007, : 617 - 621
  • [24] Categorization of web pages based on HsMM for detecting DDoS attacks
    Xie, Yi
    Yu, Shun-Zheng
    DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2006, 13E : 2330 - 2335
  • [25] Web Document Categorization by Support Vector Clustering
    Shi, Daming
    Tsui, Ming Hei
    Liu, Jigang
    2008 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), VOLS 1-6, 2008, : 1482 - 1487
  • [26] Improving Web Search by Categorization, Clustering, and Personalization
    Zhu, Dengya
    Dreher, Heinz
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2008, 5139 : 659 - +
  • [27] Micro Genre: Building Block of Web Pages
    Kudelka, Milos
    Snasel, Vaclav
    Horak, Zdenek
    Abraham, Ajith
    NDT: 2009 FIRST INTERNATIONAL CONFERENCE ON NETWORKED DIGITAL TECHNOLOGIES, 2009,
  • [28] Indexing by Permeability in Block Structured Web Pages
    Bruno, Emmanuel
    Faessel, Nicolas
    Glotin, Herve
    Le Maitre, Jacques
    Scholl, Michel
    DOCENG'09: PROCEEDINGS OF THE 2009 ACM SYMPOSIUM ON DOCUMENT ENGINEERING, 2009, : 70 - 73
  • [29] Exploiting Web Sites Structural and Content Features for Web Pages Clustering
    Lanotte, Pasqua Fabiana
    Fumarola, Fabio
    Malerba, Donato
    Ceci, Michelangelo
    FOUNDATIONS OF INTELLIGENT SYSTEMS, ISMIS 2017, 2017, 10352 : 446 - 456
  • [30] Clustering-based relevance feedback for web pages
    Yoo, Seung Yeol
    Hoffmann, Achim
    PRICAI 2006: TRENDS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4099 : 464 - 473