Web-Scale Extension of RDF Knowledge Bases from Templated Websites

Cited by: 0
Authors
Buehmann, Lorenz [1]
Usbeck, Ricardo [1, 2]
Ngomo, Axel-Cyrille Ngonga [1]
Saleem, Muhammad [1]
Both, Andreas [2]
Crescenzi, Valter [3]
Merialdo, Paolo [3]
Qiu, Disheng [3]
Affiliations
[1] Univ Leipzig, IFI AKSW, Leipzig, Germany
[2] Unister GmbH, Leipzig, Germany
[3] Univ Roma Tre, Rome, Italy
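Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract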
Only a small fraction of the information on the Web is represented as Linked Data. This lack of coverage is partly due to the paradigms followed so far to extract Linked Data. While converting structured data to RDF is well supported by tools, most approaches to extracting RDF from semi-structured data rely on ad-hoc solutions. In this paper, we present REX, a holistic and open-source framework for the extraction of RDF from templated websites. We discuss the architecture of the framework and the initial implementation of each of its components. In particular, we present a novel wrapper induction technique that does not require any human supervision to detect wrappers for websites. Our framework also includes a consistency layer with which the data extracted by the wrappers can be checked for logical consistency. We evaluate the initial version of REX on three different datasets. Our results clearly show the potential of using templated Web pages to extend the Linked Data Cloud. Moreover, our results reveal the weaknesses of our current implementation and indicate how it can be extended.
Pages: 66-81
Page count: 16