Compacting XML documents

Cited by: 1
Authors
Kálmán, M [1]
Havasi, F [1]
Gyimóthy, T [1]
Affiliation
[1] Dept Software Engn, H-6720 Szeged, Hungary
Keywords
XML; SRML; XML compaction; XML semantics
DOI
10.1016/j.infsof.2005.03.001
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
XML is now one of the most common formats for storing information. Its biggest drawback is that XML documents are large relative to the information they store. XML documents may contain redundant attributes, that is, attributes whose values can be calculated from others. These attributes can be deleted from the original document, provided the calculation rules are stored. Attribute grammars offer an analogous description for such rules: semantic rules. To apply this technique to XML, we defined a new metalanguage called SRML and developed a method that uses SRML to compact XML documents. After compaction, XML compressors can be applied to make the compacted document much smaller. With this combined approach we achieved a significant size reduction compared to the compressed size produced by XML-specific compressors alone. This article extends the previously published method with the automatic generation of rules using machine learning techniques, which can find relationships between attributes that the user might not have noticed beforehand. © 2005 Elsevier B.V. All rights reserved.
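The compaction idea in the abstract can be illustrated with a minimal sketch. It is not SRML and not the authors' implementation: the rule table `RULES`, the `item` element, and the `price`, `qty`, and `total` attributes are all hypothetical, chosen only to show how an attribute that is computable from others might be dropped and later restored.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table (an assumption, not SRML syntax):
# (element tag, derived attribute) -> function of the element's other attributes.
RULES = {
    ("item", "total"): lambda a: str(int(a["price"]) * int(a["qty"])),
}


def compact(root):
    """Drop each attribute whose stored value equals what its rule computes."""
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag != tag or attr not in elem.attrib:
                continue
            others = {k: v for k, v in elem.attrib.items() if k != attr}
            try:
                derivable = rule(others) == elem.attrib[attr]
            except (KeyError, ValueError):
                derivable = False  # rule inputs missing or not numeric
            if derivable:
                del elem.attrib[attr]
    return root


def decompact(root):
    """Recompute attributes removed by compact().

    Assumes every element covered by a rule carried the attribute
    originally; values that disagreed with the rule were never removed.
    """
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag == tag and attr not in elem.attrib:
                try:
                    elem.set(attr, rule(elem.attrib))
                except (KeyError, ValueError):
                    pass  # cannot compute; leave the element unchanged
    return root


doc = (
    '<order>'
    '<item price="4" qty="3" total="12"/>'
    '<item price="5" qty="2" total="99"/>'
    '</order>'
)
root = ET.fromstring(doc)
compact(root)    # first item loses total; second keeps its non-derivable value
decompact(root)  # first item's total is recomputed from price and qty
```

Only values that agree with the rule are removed, so exceptions survive compaction unchanged. The compacted tree could then be serialized and passed to any compressor, which corresponds to the combined approach the abstract describes.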
Pages: 90-106
Page count: 17
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2003: COOPIS, DOA, AND ODBASE, 2003, 2888 : 767 - 784