Leveraging Gene Ontology Annotations to Improve a Memory-Based Language Understanding System

Cited by: 3
Authors
Livingston, Kevin M. [1]
Johnson, Helen L. [1]
Verspoor, Karin [1]
Hunter, Lawrence E. [1]
Affiliations
[1] Univ Colorado Denver, Ctr Computat Pharmacol, Aurora, CO 80045 USA
Keywords
natural language processing (NLP); direct memory access parsing (DMAP); OpenDMAP; memory; Gene Ontology annotations; biological event extraction
DOI
10.1109/ICSC.2010.62
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work evaluates how detailed knowledge about proteins can be leveraged for language understanding and disambiguation by OpenDMAP. OpenDMAP is a memory-based language understanding system that uses patterns to identify concepts in text. These patterns match not only lexical elements, such as words, but also semantic elements, such as references to proteins. This work started with an existing pattern set used to extract biological activation events from a corpus of GeneRIFs (sentences or phrases that each describe one of possibly many functions of a gene). This is a challenging task because many distinct activation concepts, in addition to being semantically similar, are described using very similar language. We augment the previous approach with additional semantic knowledge about proteins, in the form of associated Gene Ontology annotations, and a small corresponding modification to the ontology used by OpenDMAP. By incorporating additional background knowledge, we demonstrate that performance can be significantly improved without modifying the pattern set being used. Specifically, precision is improved by 20%, at a modest 6% cost to recall. The additional semantic knowledge allows for more specificity in the ontology used by OpenDMAP, which in turn automatically improves the specificity of the patterns used to extract knowledge from text, reducing false positives by 75%.
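The idea that a more specific ontology tightens an unchanged pattern can be illustrated with a toy sketch. Everything in it is a hypothetical assumption made for illustration: the protein names, the class labels, the single three-element pattern, and the `extract` function are not OpenDMAP's API, its pattern syntax, or real Gene Ontology annotations.

```python
# Toy sketch only: names, classes, and matching logic are hypothetical,
# not OpenDMAP's implementation or real Gene Ontology data.

# One fixed pattern: semantic slot, literal word, semantic slot.
PATTERN = ("agent", "activates", "target")

# Hypothetical background knowledge: token -> semantic classes it belongs to.
ANNOTATIONS = {
    "ProteinA": {"protein"},
    "ProteinB": {"protein", "kinase"},
    "ProteinC": {"protein"},
}

# Two ontologies for the same pattern: the slot names stay the same,
# only the class each slot requires becomes more specific.
COARSE_ONTOLOGY = {"agent": "protein", "target": "protein"}
REFINED_ONTOLOGY = {"agent": "kinase", "target": "protein"}


def extract(tokens, ontology, annotations=ANNOTATIONS):
    """Return slot bindings for every token window the pattern accepts."""
    hits = []
    width = len(PATTERN)
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        bindings = {}
        for element, token in zip(PATTERN, window):
            if element in ontology:
                # Semantic element: token must carry the required class.
                if ontology[element] not in annotations.get(token, set()):
                    break
                bindings[element] = token
            elif element != token:
                # Lexical element: token must equal the literal word.
                break
        else:
            hits.append(bindings)
    return hits


TOKENS = "ProteinB activates ProteinC and ProteinC activates ProteinA".split()
coarse_hits = extract(TOKENS, COARSE_ONTOLOGY)
refined_hits = extract(TOKENS, REFINED_ONTOLOGY)
```

In this sketch the pattern tuple is never edited. Swapping `COARSE_ONTOLOGY` for `REFINED_ONTOLOGY` alone removes the match whose agent lacks the more specific class, which mirrors in miniature the trade-off the abstract reports: fewer false positives, with some risk of losing true matches.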
Pages: 40-45
Page count: 6