Data Acquisition and Information Extraction for Scientific Knowledge Base Building

Cited by: 2
Authors
Andruszkiewicz, Piotr [1]
Rybinski, Henryk [1]
Affiliations
[1] Warsaw Univ Technol, Inst Comp Sci, Warsaw, Poland
DOI
10.1109/ICSC.2018.00045
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present the process of data acquisition and information extraction for building a comprehensive and accurate scientific knowledge base covering conferences, publications, and scientists. We use two kinds of data sources. First, we gather data from structured and reliable sources, such as digital libraries, which are nevertheless incomplete and not always up to date. We then enrich the information extracted from those sources with unstructured data obtained from the Internet, filtering websites with an SVM classifier to identify potentially useful web pages. The enrichment process has two potential sources of error: the unstructured origin of the data, and the limited accuracy of the machine learning methods used for data acquisition and information extraction. We address both problems by proposing a new information extraction method and by using crowdsourcing to correct the extracted information. Our methods are currently used in a scientific platform, the Omega-psi(R) university knowledge base, which contains lists of researchers, publications, events, etc.
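The abstract does not describe the features or training setup of the SVM page filter, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes bag-of-words features, a bias-free linear SVM trained by plain sub-gradient descent on the regularised hinge loss, and a tiny made-up training set; the names `featurize`, `train_linear_svm`, `is_relevant`, and `TRAINING_PAGES` are hypothetical.

```python
import re
from collections import Counter


def featurize(text):
    """Bag-of-words term counts (an assumed feature choice)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def score(weights, features):
    """Linear decision value w . x (no bias term in this sketch)."""
    return sum(weights.get(term, 0.0) * count for term, count in features.items())


def train_linear_svm(samples, labels, epochs=20, eta=0.1, lam=0.01):
    """Sub-gradient descent on the L2-regularised hinge loss.

    samples: list of Counter feature vectors
    labels:  list of +1 (useful page) / -1 (irrelevant page)
    """
    weights = {}
    for _ in range(epochs):
        for features, y in zip(samples, labels):
            margin = y * score(weights, features)
            # Regularisation step: shrink every weight slightly.
            for term in weights:
                weights[term] *= 1.0 - eta * lam
            # Hinge-loss step: only samples inside the margin contribute.
            if margin < 1.0:
                for term, count in features.items():
                    weights[term] = weights.get(term, 0.0) + eta * y * count
    return weights


def is_relevant(weights, text):
    """Classify a page as potentially useful when its decision value is positive."""
    return score(weights, featurize(text)) > 0.0


# Made-up example pages: +1 = potentially useful, -1 = irrelevant.
TRAINING_PAGES = [
    ("call for papers conference submission deadline", 1),
    ("professor publications research group homepage", 1),
    ("buy cheap shoes online discount", -1),
    ("football match results scores", -1),
]

weights = train_linear_svm(
    [featurize(text) for text, _ in TRAINING_PAGES],
    [label for _, label in TRAINING_PAGES],
)

print(is_relevant(weights, "conference papers deadline"))
print(is_relevant(weights, "cheap shoes discount"))
```

In practice such a filter would be trained on a labelled corpus of crawled pages with a library implementation; the sketch only shows how a linear decision function separates candidate pages from irrelevant ones.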
Pages: 256-259
Number of pages: 4