A clustering approach for XML linked documents

Authors
Catania, B [1]
Maddalena, A [1]
Affiliation
[1] Univ Genoa, Genoa, Italy
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Clustering algorithms for hypertext documents consider not only the document content but also the links existing between them. All the similarity functions proposed in the literature assume that just one type of link exists between documents, with a unique semantic meaning. With the rapid diffusion of XML documents, a specific language, called XLink, has been proposed to specify different types of links inside XML documents. Each type of link forces a different degree of similarity between the documents on which it is defined; thus, we claim it must influence the computation of distance values in a different way. In this paper, after presenting a graph-based formalization of the hypertexts we consider, we introduce a distance function based on both the number and the type of the links connecting documents. Some preliminary experimental results on clustering algorithms based on the proposed function conclude the paper.
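The abstract does not give the distance function itself, so the following is only an illustrative sketch of the general idea: a distance that shrinks as more links, or more heavily weighted link types, connect two documents. The function name `link_distance`, the edge-list representation, the weight values, and the formula `1 / (1 + strength)` are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical weights: how strongly each link type suggests similarity.
# The type names mirror XLink's "simple" and "extended" link kinds, but the
# values are illustrative assumptions, not the paper's.
LINK_WEIGHTS = {"simple": 1.0, "extended": 2.0}


def link_distance(links, a, b, weights=LINK_WEIGHTS):
    """Return a distance in [0, 1] between documents a and b.

    `links` is an iterable of (source, target, link_type) triples.
    Direction is ignored; unknown link types contribute nothing.
    """
    if a == b:
        return 0.0
    # Sum the weights of every link that connects a and b in either direction.
    strength = sum(
        weights.get(link_type, 0.0)
        for src, dst, link_type in links
        if {src, dst} == {a, b}
    )
    # No links gives distance 1; each additional link pulls the distance toward 0.
    return 1.0 / (1.0 + strength)


links = [
    ("d1", "d2", "simple"),
    ("d2", "d1", "extended"),
    ("d1", "d3", "simple"),
]
print(link_distance(links, "d1", "d2"))
print(link_distance(links, "d2", "d3"))
```

A pairwise matrix built from such a function could then be passed to any clustering algorithm that accepts precomputed distances; which algorithms the authors actually evaluated is not stated in the abstract.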
Pages: 121-125
Number of pages: 5