Data Acquisition and Information Extraction for Scientific Knowledge Base Building

Cited by: 2
Authors
Andruszkiewicz, Piotr [1 ]
Rybinski, Henryk [1 ]
Affiliation
[1] Warsaw Univ Technol, Inst Comp Sci, Warsaw, Poland
DOI
10.1109/ICSC.2018.00045
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Here we present the process of data acquisition and information extraction for building a comprehensive and accurate scientific knowledge base covering conferences, publications, and scientists. We use two kinds of data sources. First, we gather data from structured and reliable, but incomplete and not always up-to-date, sources such as digital libraries. We then enrich the information extracted from those sources with unstructured data obtained from the Internet, filtering websites with an SVM classifier to identify potentially useful web pages. There are two potential sources of error in the process of information enrichment. The first is the unstructured origin of the data; the second is the limited accuracy of the machine learning methods used for data acquisition and information extraction. We address both problems by proposing a new information extraction method and by using crowdsourcing to correct the information. Our methods are currently used in a scientific platform, namely the Omega-psi(R) university knowledge base, which contains lists of researchers, publications, events, etc.
Pages: 256-259
Number of pages: 4
Related papers
  • [1] Incremental Knowledge Acquisition for building sophisticated information extraction systems with KAFTIE
    Pham, SB
    Hoffmann, A
    PRACTICAL ASPECTS OF KNOWLEDGE MANAGEMENT, PROCEEDINGS, 2004, 3336 : 292 - 306
  • [2] Muf: Tool for Knowledge Extraction and Knowledge Base Building
    Kolesa, Petr
    K-CAP'07: PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON KNOWLEDGE CAPTURE, 2007, : 191 - 192
  • [3] Towards knowledge acquisition from information extraction
    Welty, Chris
    Murdock, J. William
    SEMANTIC WEB - ISWC 2006, PROCEEDINGS, 2006, 4273 : 709 - +
  • [4] Data Knowledge Base for HENP Scientific Collaborations
    Aulov, V. A.
    Golosova, M. V.
    Grigorieva, M. A.
    Klimentov, A. A.
    Padolski, S.
    Wenaus, T.
    18TH INTERNATIONAL WORKSHOP ON ADVANCED COMPUTING AND ANALYSIS TECHNIQUES IN PHYSICS RESEARCH (ACAT2017), 2018, 1085
  • [6] Intelligent information processing for building university knowledge base
    Koperwas, Jakub
    Skonieczny, Łukasz
    Kozłowski, Marek
    Andruszkiewicz, Piotr
    Rybiński, Henryk
    Struk, Wacław
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2017, 48 (01) : 141 - 163
  • [8] Information extraction for knowledge base construction in the music domain
    Oramas, Sergio
    Espinosa-Anke, Luis
    Sordo, Mohamed
    Saggion, Horacio
    Serra, Xavier
    DATA & KNOWLEDGE ENGINEERING, 2016, 106 : 70 - 83
  • [9] Expert Information Automatic Extraction for IOT Knowledge Base
    Yi, Lu
    Yuan, Rao
    Long, Sun
    Xue, Li
    2018 INTERNATIONAL CONFERENCE ON IDENTIFICATION, INFORMATION AND KNOWLEDGE IN THE INTERNET OF THINGS, 2019, 147 : 288 - 294
  • [10] Combining Information Extraction and Human Computing for Crowdsourced Knowledge Acquisition
    Kondreddi, Sarath Kumar
    Triantafillou, Peter
    Weikum, Gerhard
    2014 IEEE 30TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2014, : 988 - 999