Multisets and clustering XML documents

Cited by: 0
Authors
Iyer, Swami [1]
Simovici, Dan A. [1]
Affiliations
[1] Univ Massachusetts, Dept Comp Sci, Boston, MA 02125 USA
DOI
10.1109/ICTAI.2007.18
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a novel and efficient solution to the problem of clustering XML documents based on their structure. We use operations on multisets of paths of document trees to define certain metrics on multisets. These metrics are used for clustering real and synthesized XML documents to produce high-quality clusterings.
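To make the idea in the abstract concrete, here is a minimal sketch of comparing two XML documents through multisets of paths. It is an illustration under stated assumptions, not the authors' method: the record does not say which paths or which metrics the paper uses, so the sketch assumes root-to-leaf tag paths and a weighted Jaccard distance on multisets. The names `path_multiset` and `multiset_jaccard_distance` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def path_multiset(xml_text):
    """Return a Counter (multiset) of root-to-leaf tag paths.

    Assumption: only element tags are used; text and attributes are ignored.
    """
    root = ET.fromstring(xml_text)
    paths = Counter()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        children = list(node)
        if not children:
            # A leaf element closes one root-to-leaf path.
            paths[path] += 1
        for child in children:
            walk(child, path)

    walk(root, "")
    return paths


def multiset_jaccard_distance(a, b):
    """Weighted Jaccard distance: 1 - |a ∩ b| / |a ∪ b| on multisets.

    Counter's & takes the minimum count per element and | takes the maximum,
    which are the usual multiset intersection and union.
    """
    union_size = sum((a | b).values())
    if union_size == 0:
        return 0.0  # two empty multisets are treated as identical
    intersection_size = sum((a & b).values())
    return 1.0 - intersection_size / union_size


# Two small made-up documents, used only for illustration.
doc_a = "<lib><book><title/><author/></book><book><title/></book></lib>"
doc_b = "<lib><book><title/></book><cd><title/></cd></lib>"

distance = multiset_jaccard_distance(path_multiset(doc_a), path_multiset(doc_b))
print(distance)  # 0.75
```

Here `doc_a` yields the path `/lib/book/title` twice and `/lib/book/author` once, while `doc_b` yields `/lib/book/title` and `/lib/cd/title` once each, so the intersection has size 1 and the union has size 4. A matrix of such pairwise distances could then be passed to any distance-based clustering algorithm.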
Pages: 267-274
Page count: 8
Related papers
50 in total
  • [1] STRUCTURAL CLASSIFICATION OF XML DOCUMENTS USING MULTISETS
    Iyer, Swami
    Simovici, Dan A.
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2008, 17 (05) : 1003 - 1022
  • [2] Fuzzy multisets and fuzzy clustering of documents
    Miyamoto, S
    10TH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3: MEETING THE GRAND CHALLENGE: MACHINES THAT SERVE PEOPLE, 2001, : 1539 - 1542
  • [3] Clustering of XML documents
    Guillaume, D
    Murtagh, F
    COMPUTER PHYSICS COMMUNICATIONS, 2000, 127 (2-3) : 215 - 227
  • [4] Clustering XML documents by structure
    Dalamagas, T
    Cheng, T
    Winkel, KJ
    Sellis, T
    METHODS AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 3025 : 112 - 121
  • [5] Clustering XML Documents by Structure
    Lesniewska, Anna
    ADVANCES IN DATABASES AND INFORMATION SYSTEMS, 2010, 5968 : 238 - 246
  • [6] Clustering schemaless XML documents
    Shen, Y
    Wang, B
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2003: COOPIS, DOA, AND ODBASE, 2003, 2888 : 767 - 784
  • [7] XML documents clustering by structures
    Nayak, Richi
    Xu, Sumei
    ADVANCES IN XML INFORMATION RETRIEVAL AND EVALUATION, 2006, 3977 : 432 - 442
  • [8] Semantic Clustering of XML Documents
    Tagarelli, Andrea
    Greco, Sergio
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2010, 28 (01)
  • [9] Collaborative clustering of XML documents
    Greco, Sergio
    Gullo, Francesco
    Ponti, Giovanni
    Tagarelli, Andrea
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2011, 77 (06) : 988 - 1008
  • [10] Clustering XML documents by patterns
    Piernik, Maciej
    Brzezinski, Dariusz
    Morzy, Tadeusz
    KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 46 (01) : 185 - 212