Crowdsourcing Linked Data Quality Assessment

Cited by: 0
Authors
Acosta, Maribel [1]
Zaveri, Amrapali [2]
Simperl, Elena [3]
Kontokostas, Dimitris [2]
Auer, Soeren [4, 5]
Lehmann, Jens [2]
Affiliations
[1] Karlsruhe Inst Technol, Inst AIFB, D-76021 Karlsruhe, Germany
[2] Univ Leipzig, Inst Informat, AKSW, D-04109 Leipzig, Germany
[3] Univ Southampton, Web & Internet Sci Grp, Southampton SO9 5NH, Hants, England
[4] Univ Bonn, Enterprise Informat Syst, Bonn, Germany
[5] Univ Bonn, Fraunhofer IAIS, Bonn, Germany
Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we look into the use of crowdsourcing as a means to handle Linked Data quality problems that are challenging to solve automatically. We analyzed the most common errors encountered in Linked Data sources and classified them according to the extent to which they are likely to be amenable to a specific form of crowdsourcing. Based on this analysis, we implemented a quality assessment methodology for Linked Data that leverages the wisdom of the crowds in two ways: (i) a contest targeting an expert crowd of researchers and Linked Data enthusiasts, complemented by (ii) paid microtasks published on Amazon Mechanical Turk. We empirically evaluated how efficiently this methodology could spot quality issues in DBpedia. We also investigated how the contributions of the two types of crowds could be optimally integrated into Linked Data curation processes. The results show that the two styles of crowdsourcing are complementary and that crowdsourcing-enabled quality assessment is a promising and affordable way to enhance the quality of Linked Data.
Pages: 260-276
Page count: 17
Related papers
50 in total
  • [31] Methodology for linked enterprise data quality assessment through information visualizations
    Gurdur, Didem
    El-khoury, Jad
    Nyberg, Mattias
    JOURNAL OF INDUSTRIAL INFORMATION INTEGRATION, 2019, 15 : 191 - 200
  • [32] A Metrics-Driven Approach for Quality Assessment of Linked Open Data
    Behkamal, Behshid
    Kahani, Mohsen
    Bagheri, Ebrahim
    Jeremic, Zoran
    JOURNAL OF THEORETICAL AND APPLIED ELECTRONIC COMMERCE RESEARCH, 2014, 9 (02) : 64 - 79
  • [33] Quality assessment in competition-based software crowdsourcing
    Hu, Zhenghui
    Wu, Wenjun
    Luo, Jie
    Wang, Xin
    Li, Boshu
    FRONTIERS OF COMPUTER SCIENCE, 2020, 14 (06)
  • [34] Quality assessment in competition-based software crowdsourcing
    Zhenghui Hu
    Wenjun Wu
    Jie Luo
    Xin Wang
    Boshu Li
    Frontiers of Computer Science, 2020, 14
  • [35] A crowdsourcing web system for curating empirical knowledge in Linked Open Data
    Mauricio Yagui, Marcela Mayumi
    Vivacqua, Adriana S.
    WEBMEDIA 2019: PROCEEDINGS OF THE 25TH BRAZILLIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB, 2019, : 441 - 444
  • [36] Noise filtering to improve data and model quality for crowdsourcing
    Li, Chaoqun
    Sheng, Victor S.
    Jiang, Liangxiao
    Li, Hongwei
    KNOWLEDGE-BASED SYSTEMS, 2016, 107 : 96 - 103
  • [37] Semantically Enriched Task and Workflow Automation in Crowdsourcing for Linked Data Management
    Basharat, Amna
    Arpinar, I. Budak
    Dastgheib, Shima
    Kursuncu, Ugur
    Kochut, Krys
    Dogdu, Erdogan
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2014, 8 (04) : 415 - 439
  • [38] Construction of Linked Urban Problem Data with Causal Relations using Crowdsourcing
    Egami, Shusaku
    Kawamura, Takahiro
    Kozaki, Kouji
    Ohsuga, Akihiko
    2017 6TH IIAI INTERNATIONAL CONGRESS ON ADVANCED APPLIED INFORMATICS (IIAI-AAI), 2017, : 814 - 819
  • [39] Noise correction to improve data and model quality for crowdsourcing
    Li, Chaoqun
    Jiang, Liangxiao
    Xu, Wenqiang
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2019, 82 : 184 - 191
  • [40] Research on Data Quality Control of Crowdsourcing Annotation: A Survey
    Lu, Jian
    Li, Wei
    Wang, Qingren
    Zhang, Yiwen
    2020 IEEE INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, INTL CONF ON CLOUD AND BIG DATA COMPUTING, INTL CONF ON CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2020, : 201 - 208