DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia

Cited by: 1698
Authors
Lehmann, Jens [1]
Isele, Robert [7]
Jakob, Max [5]
Jentzsch, Anja [4]
Kontokostas, Dimitris [1]
Mendes, Pablo N. [6]
Hellmann, Sebastian [1]
Morsey, Mohamed [1]
van Kleef, Patrick [3]
Auer, Soeren [1,8,9]
Bizer, Christian [2]
Affiliations
[1] Univ Leipzig, Inst Comp Sci, AKSW Grp, D-04009 Leipzig, Germany
[2] Univ Mannheim, Res Grp Data & Web Sci, D-68159 Mannheim, Germany
[3] OpenLink Software, Burlington, MA 01803 USA
[4] Hasso Plattner Inst IT Syst Engn, D-14482 Potsdam, Germany
[5] Neofonie GmbH, D-10115 Berlin, Germany
[6] Wright State Univ, Kno.e.sis Ohio Center of Excellence in Knowledge-Enabled Computing, Dayton, OH 45435 USA
[7] Brox IT Solut GmbH, D-30625 Hannover, Germany
[8] Univ Bonn, Enterprise Informat Syst, D-53117 Bonn, Germany
[9] Fraunhofer IAIS, D-53117 Bonn, Germany
Keywords
Knowledge extraction; Wikipedia; multilingual knowledge bases; Linked Data; RDF; Semantic Web
DOI: 10.3233/SW-140134
Chinese Library Classification (CLC): TP18 [Artificial intelligence theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
The DBpedia community project extracts structured, multilingual knowledge from Wikipedia and makes it freely available on the Web using Semantic Web and Linked Data technologies. The project extracts knowledge from 111 different language editions of Wikipedia. The largest DBpedia knowledge base, which is extracted from the English edition of Wikipedia, consists of over 400 million facts that describe 3.7 million things. The DBpedia knowledge bases that are extracted from the other 110 Wikipedia editions together consist of 1.46 billion facts and describe 10 million additional things. The DBpedia project maps Wikipedia infoboxes from 27 different language editions to a single shared ontology consisting of 320 classes and 1,650 properties. The mappings are created via a worldwide crowd-sourcing effort and enable knowledge from the different Wikipedia editions to be combined. The project publishes releases of all DBpedia knowledge bases for download and provides SPARQL query access to 14 of the 111 language editions via a global network of local DBpedia chapters. In addition to the regular releases, the project maintains a live knowledge base which is updated whenever a page in Wikipedia changes. DBpedia sets 27 million RDF links pointing into over 30 external data sources, and thus enables data from these sources to be used together with DBpedia data. Several hundred data sets on the Web publish RDF links pointing to DBpedia themselves, making DBpedia one of the central interlinking hubs in the Linked Open Data (LOD) cloud. In this system report, we give an overview of the DBpedia community project, including its architecture, technical implementation, maintenance, internationalisation, usage statistics and applications.
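As a quick illustration of the SPARQL access the abstract describes, the sketch below (Python, using the requests library) fetches a handful of facts about one resource from the public English-edition endpoint at https://dbpedia.org/sparql. The choice of resource (dbr:Berlin) and the result limit are illustrative, not taken from the paper; other language chapters expose their own endpoints (e.g. http://de.dbpedia.org/sparql).

    import requests

    # Public SPARQL endpoint of the English-language DBpedia knowledge base.
    ENDPOINT = "https://dbpedia.org/sparql"

    # Illustrative query: retrieve up to 10 property/value pairs describing
    # the resource dbr:Berlin. Any DBpedia resource URI could be used here.
    QUERY = """
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?property ?value
    WHERE { dbr:Berlin ?property ?value . }
    LIMIT 10
    """

    # The endpoint accepts GET requests; asking for the standard SPARQL
    # results JSON format makes the response easy to parse.
    response = requests.get(
        ENDPOINT,
        params={"query": QUERY, "format": "application/sparql-results+json"},
        timeout=30,
    )
    response.raise_for_status()

    # Standard SPARQL 1.1 JSON results: one binding per result row.
    for binding in response.json()["results"]["bindings"]:
        print(binding["property"]["value"], "->", binding["value"]["value"])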
Pages: 167-195
Number of pages: 29
Related papers (50 in total)
[41] Ben Veyseh, Amir Pouran; Minh Van Nguyen; Dernoncourt, Franck; Thien Huu Nguyen. MINION: a Large-Scale and Diverse Dataset for Multilingual Event Detection. NAACL 2022: The 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022: 2286-2299.
[42] Agirre, Eneko; Castellon, Irene; Padro, Lluis; Climent, Salvador; Rigau, German; Alonso, Laura; Cuadros, Montse; Coll-Florit, Marta. KNOW: Developing large-scale multilingual technologies for language understanding. Procesamiento del Lenguaje Natural, 2009, (43): 377-378.
[43] Cao, Ya-nan; Zhang, Peng; Guo, Jing; Guo, Li. Mining Large-scale Event Knowledge from Web Text. 2014 International Conference on Computational Science, 2014, 29: 478-487.
[44] Ghalandari, Demian Gholipour; Hokamp, Chris; Nghia The Pham; Glover, John; Ifrim, Georgiana. A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal. 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), 2020: 1302-1308.
[45] Harige, Ravindra; Buitelaar, Paul. Generating a Large-Scale Entity Linking Dictionary from Wikipedia Link Structure and Article Text. LREC 2016 - Tenth International Conference on Language Resources and Evaluation, 2016: 2431-2434.
[46] Liu, Yu; Hua, Wen; Zhou, Xiaofang. Temporal knowledge extraction from large-scale text corpus. World Wide Web, 2021, 24(1): 135-156.
[47] Chen, Jiaqiang; Tandon, Niket; de Melo, Gerard. Neural Word Representations from Large-Scale Commonsense Knowledge. 2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), Vol. 1, 2015: 225-228.
[48] Jung, Yuchul; Lee, Joo-Young; Kim, Youngho; Park, Jaehyun; Myaeng, Sung-Hyon; Rim, Hae-Chang. Building a large-scale commonsense knowledge base by converting an existing one in a different language. Computational Linguistics and Intelligent Text Processing, 2007, 4394: 23+.
[49] Nguyen, Tuan-Phong; Razniewski, Simon; Romero, Julien; Weikum, Gerhard. Refined Commonsense Knowledge From Large-Scale Web Contents. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(8): 8431-8447.