Studying the XML Web: Gathering statistics from an XML sample

Cited by: 27
Authors
Barbosa, D
Mignet, L
Veltri, P
Affiliations
[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 3G5, Canada
[2] IBM India Res Lab, New Delhi 110016, India
[3] Magna Graecia Univ Catanzaro, Dept Expt & Clin Med, I-88100 Catanzaro, Italy
[4] INRIA, Paris, France
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2005, Vol. 8, No. 4
Keywords
World Wide Web; XML; XML web; XML Documents; XML processing tools
DOI
10.1007/s11280-005-1544-y
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
XML has emerged as the language for exchanging data on the web and has attracted considerable interest both in industry and in academia. Nevertheless, to date, little is known about the XML documents published on the web. This paper presents a comprehensive analysis of a sample of about 200,000 XML documents on the web, and is the first study of its kind. We study the distribution of XML documents across the web in several ways; moreover, we provide a detailed characterization of the structure of real XML documents. Our results provide valuable input to the design of algorithms, tools and systems that use XML in one form or another.
Pages: 413-438
Page count: 26