Integrating HTML tables using semantic hierarchies and meta-data sets

Cited by: 2
Authors
Lim, SJ [1 ]
Ng, YK [1 ]
Yang, XC [1 ]
Affiliation
[1] Brigham Young Univ, Dept Comp Sci, Provo, UT 84602 USA
Source
IDEAS 2002: INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM, PROCEEDINGS | 2002
DOI
10.1109/IDEAS.2002.1029668
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As the Internet is a global network, there is a demand on accessing closely related data without browsing through different Web documents. A significant amount of these data are presented in HTML documents. Since data contents of HTML documents are intervened by markups, it is not trivial to integrate and provide a unified view of closely related data in different HTML documents. In this paper we present an approach for integrating semantically related data in any HTML tables that belong to a particular domain of interest (ID), such as house/apartment rental, by using the semantic hierarchies generated from the tables and the predefined meta-data sets that indicate related column names in ID. In our approach, we capture each data source as semi-structured data, called semantic hierarchy, and the end result of integrating different HTML tables of ID is a unified view of data in the tables, which is presented in an XML document. Besides HTML tables, our approach can be adopted by any system that integrates semi-structured data across different platforms.
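The idea in the abstract of using a predefined meta-data set to align related column names across tables can be illustrated with a minimal Python sketch. This is not the authors' implementation: it omits the semantic hierarchies entirely, assumes one simple table per document with a header row, and the names `META_DATA`, `TableParser`, `canonical`, and `integrate`, along with the rental column aliases, are hypothetical choices made for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical meta-data set for a rental domain:
# canonical column name -> related column names seen in source tables.
META_DATA = {
    "rent": {"rent", "price", "monthly rent"},
    "bedrooms": {"bedrooms", "beds", "br"},
}


class TableParser(HTMLParser):
    """Collect table rows as lists of cell text (assumes one flat table)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def canonical(name):
    """Map a source column name to its canonical name, or None if unknown."""
    key = name.strip().lower()
    for canon, aliases in META_DATA.items():
        if key in aliases:
            return canon
    return None


def integrate(html_docs):
    """Merge the tables in html_docs into one XML document (as a string)."""
    root = ET.Element("listings")
    for doc in html_docs:
        parser = TableParser()
        parser.feed(doc)
        parser.close()
        if not parser.rows:
            continue
        # The first row is treated as the header; unknown columns are dropped.
        header = [canonical(h) for h in parser.rows[0]]
        for row in parser.rows[1:]:
            item = ET.SubElement(root, "listing")
            for col, value in zip(header, row):
                if col is not None:
                    ET.SubElement(item, col).text = value
    return ET.tostring(root, encoding="unicode")
```

Calling `integrate` on two documents whose tables use different headers (for example "Price"/"Beds" in one and "BR"/"Monthly Rent" in the other) yields a single XML document in which every record uses the same canonical element names.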
Pages: 160-169
Page count: 10
Related papers
50 in total
  • [21] A hybrid quantum approach to leveraging data from HTML tables
    Patricia Jiménez
    Juan C. Roldán
    Rafael Corchuelo
    Knowledge and Information Systems, 2022, 64 : 441 - 474
  • [22] A Reversible Data Hiding Scheme Using Cartesian Product for HTML File
    Chou, Yung-Chen
    Huang, Chun-Yi
    Liao, Hsin-Chi
    2012 SIXTH INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING (ICGEC), 2012, : 153 - 156
  • [23] Wikxhibit: Using HTML and Wikidata to Author Applications that Link Data Across the Web
    Alrashed, Tarfah
    Verou, Lea
    Karger, David R.
    PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON USER INTERFACE SOFTWARE AND TECHNOLOGY, UIST 2022, 2022,
  • [24] Using weight-controlled token matching to extract data from HTML files
    Yan, X
    Liang, TW
    SECOND INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS ENGINEERING, VOL I, PROCEEDINGS, 2002, : 341 - 349
  • [25] Data Hiding for HTML Files Using Character Coding Table and Index Coding Table
    Chou, Yung-Chen
    Hsu, Ping-Kun
    Lin, Iuon-Chang
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2013, 7 (11): : 2913 - 2927
  • [27] Semantic search in Wiki using HTML5 microdata for semantic annotation
    Pabitha, P.
    Vignesh Nandha Kumar, K.R.
    Pandurangan, N.
    Vijayakumar, R.
    Rajaram, M.
    International Journal of Computer Science Issues, 2011, 8 (3 3-1): : 388 - 394
  • [28] Geo-Data Visualization on Online and Offline Mode of Mobile Web using HTML5
    Park, Hansaem
    Kim, Kwangseob
    Lee, Kiwon
    2016 4th International Workshop on Earth Observation and Remote Sensing Applications (EORSA), 2016,
  • [29] Legislative meta-data based on semantic formal models
    Leuzi, V. Bartalesi
    Biagioli, C.
    Cappelli, A.
    Sprugnoli, R.
    Turchi, F.
    METADATA AND SEMANTICS, 2009, : 329 - +
  • [30] A Web-based infrastructure for the management of semantic meta-data
    Del Bianco, V
    Ripa, G
    Tracanella, E
    Lavazza, L
    NINTH IEEE INTERNATIONAL CONFERENCE ON ENGINEERING COMPLEX COMPUTER SYSTEMS, PROCEEDINGS: NAVIGATING COMPLEXITY IN THE E-ENGINEERING AGE, 2004, : 181 - 190