Knowledge Extraction and Semantic Annotation of Text from the Encyclopedia of Life

Cited by: 16
Authors
Thessen, Anne E. [1]
Parr, Cynthia Sims [2]
Affiliations
[1] Arizona State Univ, Sch Life Sci, Tempe, AZ 85283 USA
[2] Smithsonian Inst, Natl Museum Nat Hist, Washington, DC 20560 USA
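Source
PLOS ONE | 2014 / Vol. 9 / Issue 03
DOI
10.1371/journal.pone.0089550
Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract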
Numerous digitization and ontological initiatives have focused on translating biological knowledge from narrative text to machine-readable formats. In this paper, we describe two workflows for knowledge extraction and semantic annotation of text data objects featured in an online biodiversity aggregator, the Encyclopedia of Life. One workflow tags text with DBpedia URIs based on keywords. Another workflow finds taxon names in text using GNRD for the purpose of building a species association network. Both workflows work well: the annotation workflow has an F1 Score of 0.941 and the association algorithm has an F1 Score of 0.885. Existing text annotators such as Terminizer and DBpedia Spotlight performed well, but require some optimization to be useful in the ecology and evolution domain. Important future work includes scaling up and improving accuracy through the use of distributional semantics.
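As a rough illustration of the keyword-based tagging and the F1 evaluation mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The keyword table, the helper names `tag_text` and `f1_score`, and the sample sentence are all assumptions made for the example; the record does not give the paper's actual keyword list or evaluation data.

```python
import re

# Hypothetical keyword-to-URI table (illustrative only; the keyword list
# actually used in the paper is not given in this record).
KEYWORD_URIS = {
    "predator": "http://dbpedia.org/resource/Predation",
    "parasite": "http://dbpedia.org/resource/Parasitism",
    "pollinator": "http://dbpedia.org/resource/Pollinator",
}


def tag_text(text):
    """Return the set of URIs whose keyword (singular or plural) occurs in text."""
    found = set()
    for keyword, uri in KEYWORD_URIS.items():
        # Whole-word, case-insensitive match with an optional trailing "s".
        pattern = r"\b" + re.escape(keyword) + r"s?\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(uri)
    return found


def f1_score(predicted, expected):
    """Harmonic mean of precision and recall over two sets of annotations."""
    true_positives = len(predicted & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


sample = "The wasp is a parasite of caterpillars and a predator of flies."
tags = tag_text(sample)
```

Here `tags` holds the URIs for the two keywords found in the sample sentence, and `f1_score(tags, gold)` would compare them against a hand-annotated set `gold` (also hypothetical).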
Pages: 10