Leveraging Gene Ontology Annotations to Improve a Memory-Based Language Understanding System

Cited by: 3
Authors
Livingston, Kevin M. [1]
Johnson, Helen L. [1]
Verspoor, Karin [1]
Hunter, Lawrence E. [1]
Affiliations
[1] Univ Colorado Denver, Ctr Computat Pharmacol, Aurora, CO 80045 USA
Keywords
natural language processing (NLP); direct memory access parsing (DMAP); OpenDMAP; memory; Gene Ontology annotations; biological event extraction
DOI
10.1109/ICSC.2010.62
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work evaluates how detailed knowledge about proteins can be leveraged for language understanding and disambiguation by OpenDMAP. OpenDMAP is a memory-based language understanding system that uses patterns to identify concepts in text. These patterns match not only lexical elements, such as words, but also semantic elements, such as references to proteins. This work started with an existing pattern set used to extract biological activation events from a corpus of GeneRIFs (sentences or phrases that each describe one of the many functions of a gene). This is a challenging task because many distinct activation concepts, in addition to being semantically similar, are described using very similar language. We augment the previous approach with additional semantic knowledge about proteins, in the form of associated Gene Ontology annotations, and a small corresponding modification to the ontology used by OpenDMAP. By incorporating this additional background knowledge, we demonstrate that performance can be significantly improved without modifying the pattern set being used. Specifically, precision is improved by 20% at a modest 6% cost to recall. The additional semantic knowledge allows for more specificity in the ontology used by OpenDMAP, which in turn automatically improves the specificity of the patterns used to extract knowledge from text, reducing false positives by 75%.
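The idea in the abstract, that narrowing the class a pattern slot refers to makes an unchanged pattern more selective, can be illustrated with a minimal sketch. This is not OpenDMAP's API or the authors' implementation: the `ProteinSlot` class, the `extract` function, the single "activates" pattern, and the annotation table are all hypothetical, and the annotation entries are toy data chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Toy annotation table (hypothetical data): protein mention -> GO term labels.
GO_ANNOTATIONS = {
    "RAF1": {"protein kinase activity"},
    "TP53": {"DNA-binding transcription factor activity"},
    "ALB": {"fatty acid binding"},
}


@dataclass(frozen=True)
class ProteinSlot:
    """A semantic slot in a pattern (hypothetical helper).

    required_go=None models the general "any protein" class; a set of GO
    labels models a more specific class derived from annotations.
    """
    required_go: Optional[FrozenSet[str]] = None

    def accepts(self, mention: str) -> bool:
        terms = GO_ANNOTATIONS.get(mention)
        if terms is None:
            return False  # not a known protein mention
        if self.required_go is None:
            return True  # general class: every known protein fits
        return not terms.isdisjoint(self.required_go)


# The lexical part of the pattern stays the same in both configurations.
PATTERN = re.compile(r"(\w+) activates (\w+)")


def extract(
    sentence: str,
    activator_slot: ProteinSlot,
    target_slot: ProteinSlot = ProteinSlot(),
) -> Optional[Tuple[str, str]]:
    """Return (activator, target) if the sentence matches and both slots accept."""
    match = PATTERN.search(sentence)
    if match is None:
        return None
    activator, target = match.group(1), match.group(2)
    if activator_slot.accepts(activator) and target_slot.accepts(target):
        return activator, target
    return None


ANY_PROTEIN = ProteinSlot()
KINASE = ProteinSlot(frozenset({"protein kinase activity"}))
```

With this toy table, `extract("ALB activates TP53", ANY_PROTEIN)` yields a match, while the same call with `KINASE` as the activator slot returns `None`. The pattern text is identical in both cases; only the class behind the slot has changed, which is the kind of effect the abstract attributes to the ontology modification.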
Pages: 40-45
Number of pages: 6