Calculating semantic relatedness for biomedical use in a knowledge-poor environment

Cited by: 1
Authors
Rybinski, Maciej [1 ]
Francisco Aldana-Montes, Jose [1 ]
Affiliation
[1] Univ Malaga, Dept LCC, Malaga 29010, Spain
Source
BMC BIOINFORMATICS | 2014 / Vol. 15
Keywords
SIMILARITY; DOMAIN;
DOI
10.1186/1471-2105-15-S14-S2
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: Computing semantic relatedness between textual labels representing biological and medical concepts is a crucial task in many automated knowledge extraction and processing applications relevant to the biomedical domain, specifically due to the huge amount of new findings being published each year. Most methods benefit from making use of highly specific resources, thus reducing their usability in many real-world scenarios that differ from the original assumptions. In this paper we present a simple, resource-efficient method for calculating semantic relatedness in a knowledge-poor environment. The method obtains results comparable to state-of-the-art methods, while being more generic and flexible. The solution presented here was designed to use only a relatively generic and small document corpus and its statistics, without referring to a previously defined knowledge base; thus it does not assume a 'closed' problem.

Results: We propose a method in which computation for two input texts is based on the idea of comparing the vocabulary associated with the best-fit documents related to those texts. As keyterm extraction is a costly process, it is done in a preprocessing step on a 'per-document' basis in order to limit the on-line processing. The actual computations are executed in a compact vector space, limited by the most informative extraction results. The method has been evaluated on five direct benchmarks by calculating correlation coefficients w.r.t. average human answers. It has also been used on Gene-Disease and Disease-Disease data pairs to highlight its potential use as a data analysis tool. Apart from comparisons with reported results, some interesting features of the method have been studied, namely the relationship between result quality, efficiency, and the applicable trimming threshold for size reduction. Experimental evaluation shows that the presented method obtains results that are comparable with current state-of-the-art methods, even surpassing them on a majority of the benchmarks. Additionally, a possible usage scenario for the method is showcased with a real-world data experiment.

Conclusions: Our method improves on the flexibility of the existing methods without a notable loss of quality. It is a legitimate alternative to the costly construction of specialized knowledge-rich resources.
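
The general idea described in the abstract (per-document keyterm extraction done offline, a trimmed vector space, and comparison of the vocabulary of best-fit documents) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy corpus, the TF-IDF style weighting, the token-overlap document matching, and the names `extract_keyterms`, `text_vector`, `relatedness`, `top_k` and `n_best` are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the corpus used in the paper is not reproduced here.
CORPUS = [
    "insulin regulates glucose levels in diabetes patients",
    "diabetes is linked to insulin resistance and high glucose",
    "asthma causes airway inflammation and breathing difficulty",
    "inhaled steroids reduce airway inflammation in asthma",
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, kept deliberately crude)."""
    return text.lower().split()


def extract_keyterms(docs, top_k):
    """Offline step: weight each document's terms, keep only the top_k.

    top_k plays the role of a trimming threshold that keeps vectors compact.
    The TF-IDF style weight is an assumed stand-in for keyterm extraction.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    keyterms = []
    for toks in tokenized:
        term_freq = Counter(toks)
        weights = {t: c * math.log(1 + n_docs / doc_freq[t])
                   for t, c in term_freq.items()}
        # Sort by descending weight, then alphabetically for determinism.
        top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        keyterms.append(dict(top))
    return tokenized, keyterms


def text_vector(text, tokenized, keyterms, n_best=2):
    """Online step: sum the keyterm vectors of the best-fit documents."""
    query = set(tokenize(text))
    scores = [(len(query & set(toks)), i) for i, toks in enumerate(tokenized)]
    best = sorted(scores, key=lambda s: (-s[0], s[1]))[:n_best]
    vec = Counter()
    for score, i in best:
        if score > 0:  # ignore documents that share no token with the text
            for term, weight in keyterms[i].items():
                vec[term] += score * weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relatedness(text1, text2, tokenized, keyterms):
    """Relatedness of two texts via the vocabulary of their best-fit documents."""
    return cosine(text_vector(text1, tokenized, keyterms),
                  text_vector(text2, tokenized, keyterms))


tokenized, keyterms = extract_keyterms(CORPUS, top_k=5)
print(relatedness("insulin", "glucose", tokenized, keyterms))
print(relatedness("insulin", "asthma", tokenized, keyterms))
```

On this toy corpus the first pair scores higher than the second, because "insulin" and "glucose" map to the same documents while "insulin" and "asthma" do not. Lowering `top_k` shrinks the vectors further, which is one simple way to explore the trade-off between result quality and efficiency that the abstract mentions.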
Pages: 16