Leveraging Gene Ontology Annotations to Improve a Memory-Based Language Understanding System

Cited by: 3
Authors
Livingston, Kevin M. [1]
Johnson, Helen L. [1]
Verspoor, Karin [1]
Hunter, Lawrence E. [1]
Affiliations
[1] Univ Colorado Denver, Ctr Computat Pharmacol, Aurora, CO 80045 USA
Keywords
natural language processing (NLP); direct memory access parsing (DMAP); OpenDMAP; memory; Gene Ontology annotations; biological event extraction
DOI
10.1109/ICSC.2010.62
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work evaluates how detailed knowledge about proteins can be leveraged for language understanding and disambiguation by OpenDMAP. OpenDMAP is a memory-based language understanding system that uses patterns to identify concepts in text. These patterns match not only lexical elements, such as words, but also semantic elements, such as references to proteins. This work started with an existing pattern set used to extract biological activation events from a corpus of GeneRIFs (sentences or phrases that each describe one of the many functions of a gene). This is a challenging task because many distinct activation concepts, in addition to being semantically similar, are described using very similar language. We augment the previous approach with additional semantic knowledge about proteins, in the form of associated Gene Ontology annotations, and a small corresponding modification to the ontology used by OpenDMAP. By incorporating additional background knowledge, we demonstrate that performance can be significantly improved without modifying the pattern set being used. Specifically, precision is improved by 20%, at a modest 6% cost to recall. The additional semantic knowledge allows for more specificity in the ontology used by OpenDMAP, which in turn automatically improves the specificity of the patterns being used to extract knowledge from text, reducing false positives by 75%.
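The mechanism the abstract describes, where a pattern slot accepts a mention only if the mention belongs to the ontology class the slot requires, can be illustrated with a toy sketch. This is not OpenDMAP's API or the authors' pattern set: the protein names, the annotation strings, the class names, and the functions `classes_of` and `match` are all hypothetical and chosen only for illustration. The pattern text stays the same in both calls; only the class required of the slot becomes more specific.

```python
# Toy sketch only: NOT OpenDMAP's API. Every name and annotation below is hypothetical.

# Hypothetical Gene Ontology-style annotations for protein mentions found in text.
ANNOTATIONS = {
    "ProtA": {"kinase activity"},
    "ProtB": {"transcription factor activity"},
    "ProtC": set(),  # a protein with no relevant annotation
}

# Assumed mapping from an annotation to a more specific ontology class.
CLASS_FOR_TERM = {
    "kinase activity": "Kinase",
    "transcription factor activity": "TranscriptionFactor",
}

# Minimal is-a hierarchy: the specific classes are subclasses of Protein.
PARENT = {"Kinase": "Protein", "TranscriptionFactor": "Protein", "Protein": None}


def classes_of(mention):
    """Return every ontology class a mention belongs to, including ancestors."""
    found = {"Protein"} if mention in ANNOTATIONS else set()
    for term in ANNOTATIONS.get(mention, ()):
        cls = CLASS_FOR_TERM.get(term)
        while cls is not None:
            found.add(cls)
            cls = PARENT[cls]
    return found


def match(tokens, slot_class):
    """Match the fixed pattern '[activator] activates [activated]'.

    The activator slot must belong to slot_class; the activated slot
    only needs to be a Protein.
    """
    if len(tokens) != 3 or tokens[1] != "activates":
        return None
    agent, _, target = tokens
    if slot_class in classes_of(agent) and "Protein" in classes_of(target):
        return {"activator": agent, "activated": target}
    return None


# Coarse ontology: any protein can fill the activator slot.
coarse = match("ProtC activates ProtA".split(), "Protein")
# Refined ontology: the same pattern now rejects an unannotated protein.
refined = match("ProtC activates ProtA".split(), "Kinase")
```

In this sketch the tighter slot class removes a match that the coarse class allowed, which mirrors the trade-off the abstract reports: fewer false positives at some cost to recall.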
Pages: 40-45
Page count: 6