Compacting XML documents

Cited by: 1
Authors
Kálmán, M [1]
Havasi, F [1]
Gyimóthy, T [1]
Affiliations
[1] Dept Software Engn, H-6720 Szeged, Hungary
Keywords
XML; SRML; XML compaction; XML semantics
DOI
10.1016/j.infsof.2005.03.001
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
XML is now one of the most common formats for storing information. Its biggest drawback is that XML documents are rather large compared to the information they store. XML documents may contain redundant attributes, that is, attributes that can be calculated from others. These redundant attributes can be deleted from the original XML document, provided the calculation rules are stored. Attribute grammars offer an analogous description for such rules: semantic rules. To use this technique in an XML environment, we defined a new metalanguage called SRML and developed a method that uses it to compact XML documents. After compaction, XML compressors can be applied to make the compacted document much smaller. With this combined approach we achieved a significant size reduction compared to the compressed size produced by XML-specific compressors. This article extends the previously published method with automatic rule generation using machine learning techniques, which can find relationships between attributes that the user might not have noticed beforehand. (c) 2005 Elsevier B.V. All rights reserved.
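The compaction idea in the abstract can be illustrated with a minimal sketch. It is not SRML and not the authors' implementation: the `RULES` table, the `item` element, and the `price`, `quantity`, and `total` attributes are hypothetical names chosen only for illustration. An attribute is dropped when a stored rule reproduces its value, and it is recomputed from the remaining attributes on restoration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table (an assumption, not SRML syntax): for the element
# tag and attribute named in the key, the value can be recomputed from the
# element's other attributes.
RULES = {
    ("item", "total"): lambda a: str(int(a["price"]) * int(a["quantity"])),
}


def compact(root):
    """Delete each attribute whose stored value equals what its rule computes."""
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag != tag or attr not in elem.attrib:
                continue
            try:
                computed = rule(elem.attrib)
            except (KeyError, ValueError):
                continue  # inputs missing or malformed: keep the attribute
            if computed == elem.attrib[attr]:
                del elem.attrib[attr]
    return root


def restore(root):
    """Recompute attributes that compact() removed."""
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag != tag or attr in elem.attrib:
                continue
            try:
                elem.set(attr, rule(elem.attrib))
            except (KeyError, ValueError):
                pass  # rule not applicable to this element
    return root


if __name__ == "__main__":
    doc = ET.fromstring(
        '<order>'
        '<item price="4" quantity="3" total="12"/>'
        '<item price="5" quantity="2" total="99"/>'
        '</order>'
    )
    compact(doc)
    print(ET.tostring(doc, encoding="unicode"))
    restore(doc)
    print(ET.tostring(doc, encoding="unicode"))
```

In this sketch an attribute that disagrees with its rule (the second `item`) is kept verbatim, so restoration reproduces it unchanged. The sketch also assumes every element covered by a rule originally carried the attribute; an element that never had it would gain one on restoration, so a real scheme would need to record such exceptions.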
Pages: 90-106
Number of pages: 17
Related papers
50 in total
  • [1] Compacting XML data
    Zhang, SH
    Dyreson, C
    Dang, Z
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2006, 3882 : 767 - 776
  • [2] An approach for compacting XMI documents
    Kalman, Miklos
    ACTA CYBERNETICA, 2005, 17 (02): : 289 - 310
  • [3] Clustering of XML documents
    Guillaume, D
    Murtagh, F
    COMPUTER PHYSICS COMMUNICATIONS, 2000, 127 (2-3) : 215 - 227
  • [4] Classification of XML documents
    Bouchachia, Abdelhamid
    Hassler, Marcus
    2007 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING, VOLS 1 AND 2, 2007, : 390 - 396
  • [5] Merging of XML documents
    Wei, WX
    Liu, MC
    Li, SJ
    CONCEPTUAL MODELING - ER 2004, PROCEEDINGS, 2004, 3288 : 273 - 285
  • [6] Slicing XML Documents
    Silva, Josep
    ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2006, 157 (02) : 187 - 192
  • [7] Filtering of XML documents
    Ballis, D.
    Romero, D.
    SELECTED PAPERS FROM THE SECOND INTERNATIONAL WORKSHOP ON AUTOMATED SPECIFICATION AND VERIFICATION OF WEB SYSTEMS, 2007, : 19 - 26
  • [8] Naming in XML documents
    Lawrence, R
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2002: COOPIS, DOA, AND ODBASE, 2002, 2519 : 1287 - 1303
  • [9] Securing XML documents
    Damiani, E
    di Vimercati, SD
    Paraboschi, S
    Samarati, P
    ADVANCES IN DATABASE TECHNOLOGY-EDBT 2000, PROCEEDINGS, 2000, 1777 : 121 - 135
  • [10] Structuring XML documents
    Mobley, K
    TECHNICAL COMMUNICATION, 2000, 47 (02) : 253 - 255