DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia

Cited by: 1698
Authors
Lehmann, Jens [1]
Isele, Robert [7]
Jakob, Max [5]
Jentzsch, Anja [4]
Kontokostas, Dimitris [1]
Mendes, Pablo N. [6]
Hellmann, Sebastian [1]
Morsey, Mohamed [1]
van Kleef, Patrick [3]
Auer, Soeren [1, 8, 9]
Bizer, Christian [2]
Affiliations
[1] Univ Leipzig, Inst Comp Sci, AKSW Grp, D-04009 Leipzig, Germany
[2] Univ Mannheim, Res Grp Data & Web Sci, D-68159 Mannheim, Germany
[3] OpenLink Software, Burlington, MA 01803 USA
[4] Hasso Plattner Inst IT Syst Engn, D-14482 Potsdam, Germany
[5] Neofonie GmbH, D-10115 Berlin, Germany
[6] Wright State Univ, Kno E Sis Ohio Ctr Excellence Knowledge Enabled C, Dayton, OH 45435 USA
[7] Brox IT Solut GmbH, D-30625 Hannover, Germany
[8] Univ Bonn, Enterprise Informat Syst, D-53117 Bonn, Germany
[9] Fraunhofer IAIS, D-53117 Bonn, Germany
Keywords
Knowledge extraction; Wikipedia; multilingual knowledge bases; Linked Data; RDF; Semantic Web
DOI
10.3233/SW-140134
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The DBpedia community project extracts structured, multilingual knowledge from Wikipedia and makes it freely available on the Web using Semantic Web and Linked Data technologies. The project extracts knowledge from 111 different language editions of Wikipedia. The largest DBpedia knowledge base, which is extracted from the English edition of Wikipedia, consists of over 400 million facts that describe 3.7 million things. The DBpedia knowledge bases that are extracted from the other 110 Wikipedia editions together consist of 1.46 billion facts and describe 10 million additional things. The DBpedia project maps Wikipedia infoboxes from 27 different language editions to a single shared ontology consisting of 320 classes and 1,650 properties. The mappings are created via a world-wide crowd-sourcing effort and enable knowledge from the different Wikipedia editions to be combined. The project publishes releases of all DBpedia knowledge bases for download and provides SPARQL query access to 14 out of the 111 language editions via a global network of local DBpedia chapters. In addition to the regular releases, the project maintains a live knowledge base which is updated whenever a page in Wikipedia changes. DBpedia sets 27 million RDF links pointing into over 30 external data sources and thus enables data from these sources to be used together with DBpedia data. Several hundred data sets on the Web themselves publish RDF links pointing to DBpedia, making DBpedia one of the central interlinking hubs in the Linked Open Data (LOD) cloud. In this system report, we give an overview of the DBpedia community project, including its architecture, technical implementation, maintenance, internationalisation, usage statistics and applications.
Pages: 167-195
Page count: 29