Linked open data-based framework for automatic biomedical ontology generation

Cited by: 19
Authors
Alobaidi, Mazen [1, 2]
Malik, Khalid Mahmood [1]
Sabra, Susan [1]
Affiliations
[1] Oakland Univ, Comp Sci & Engn Dept, 2200 N Squirrel Rd, Rochester, MI 48309 USA
[2] Micro Focus Int Plc, Troy, MI 48084 USA
Source
BMC BIOINFORMATICS | 2018 / Vol. 19
Keywords
Semantic web; Ontology generation; Linked open data; Semantic enrichment; INFORMATION; EXTRACTION; TEXT;
DOI
10.1186/s12859-018-2339-3
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Fulfilling the vision of the Semantic Web requires an accurate data model for organizing knowledge and sharing a common understanding of the domain. Ontologies fit this description: they are the cornerstones of the Semantic Web and can be used to solve many problems in clinical information and biomedical engineering, such as word sense disambiguation, semantic similarity, question answering, and ontology alignment. Manual construction of an ontology is labor-intensive and requires domain experts and ontology engineers. To reduce this labor and minimize the need for domain experts, we present a novel automated ontology generation framework, Linked Open Data approach for Automatic Biomedical Ontology Generation (LOD-ABOG), which is empowered by Linked Open Data (LOD). LOD-ABOG performs concept extraction using knowledge bases, mainly UMLS and LOD, along with Natural Language Processing (NLP) operations, and applies relation extraction using LOD, the breadth-first search (BFS) graph method, and Freepal repository patterns.
Results: Our evaluation shows improved results in most ontology generation tasks compared with those obtained by existing frameworks. We evaluated the performance of the individual tasks (modules) of the proposed framework using the CDR and SemMedDB datasets. For concept extraction, the evaluation shows an average F-measure of 58.12% for the CDR corpus and 81.68% for SemMedDB; for biomedical taxonomic relation extraction, an F-measure of 65.26% and 77.44% on CDR and SemMedDB, respectively; and for biomedical non-taxonomic relation extraction, an F-measure of 52.78% and 58.12% on CDR and SemMedDB, respectively. Additionally, the comparison with a manually constructed baseline Alzheimer ontology shows an F-measure of 72.48% for concept detection, 76.27% for relation extraction, and 83.28% for property extraction. We also compared the proposed framework with the ontology learning framework "OntoGain"; LOD-ABOG performs 14.76% better in terms of relation extraction.
Conclusion: This paper has presented the LOD-ABOG framework, which shows that current LOD sources and technologies are a promising solution for automating biomedical ontology generation and for extracting relations to a greater extent. In addition, unlike existing frameworks, which require domain experts throughout the ontology development process, the proposed approach requires their involvement only for improvement purposes at the end of the ontology life cycle.
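The abstract names breadth-first search over a graph as one ingredient of relation extraction but does not describe the procedure. As a rough illustration only, the following Python sketch runs BFS over a small in-memory set of subject-predicate-object triples and returns the shortest chain of triples linking two concepts. The triples, the function names `build_adjacency` and `bfs_relation_path`, and the `max_depth` cutoff are all hypothetical and are not taken from LOD-ABOG.

```python
from collections import deque

# Hypothetical toy triples (subject, predicate, object) for illustration;
# they are not data from the paper or from any LOD source.
TRIPLES = [
    ("donepezil", "treats", "alzheimer_disease"),
    ("alzheimer_disease", "is_a", "dementia"),
    ("dementia", "is_a", "brain_disease"),
    ("donepezil", "is_a", "cholinesterase_inhibitor"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    adjacency = {}
    for subject, predicate, obj in triples:
        adjacency.setdefault(subject, []).append((predicate, obj))
    return adjacency


def bfs_relation_path(triples, source, target, max_depth=3):
    """Return the shortest list of triples leading from source to target,
    following edges in their stated direction, or None if no path of at
    most max_depth edges exists."""
    adjacency = build_adjacency(triples)
    queue = deque([(source, [])])  # (current node, path of triples so far)
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) >= max_depth:
            continue  # do not expand beyond the depth limit
        for predicate, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, predicate, neighbor)]))
    return None


if __name__ == "__main__":
    print(bfs_relation_path(TRIPLES, "donepezil", "dementia"))
```

Because BFS explores the graph level by level, the first path it finds uses the fewest edges; the depth cutoff is one simple way to keep the search bounded on a large graph.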
Pages: 13
Related papers
50 items in total
  • [21] Bridging the gap between linked open data-based recommender systems and distributed representations
    Basile, Pierpaolo
    Greco, Claudio
    Suglia, Alessandro
    Semeraro, Giovanni
    INFORMATION SYSTEMS, 2019, 86 : 1 - 8
  • [22] Overarching framework for data-based modelling
    Schelter, Bjorn
    Mader, Malenka
    Mader, Wolfgang
    Sommerlade, Linda
    Platt, Bettina
    Lai, Ying-Cheng
    Grebogi, Celso
    Thiel, Marco
    EPL, 2014, 105 (03)
  • [23] SHELDON: Semantic Holistic framEwork for LinkeD ONtology Data
    Recupero, Diego Reforgiato
    Nuzzolese, Andrea Giovanni
    Consoli, Sergio
    Gangemi, Aldo
    Presutti, Valentina
    KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT, EKAW 2014, 2015, 8982 : 136 - 139
  • [24] Open Biomedical Ontology-based Medline exploration
    Xuan, Weijian
    Dai, Manhong
    Mirel, Barbara
    Song, Jean
    Athey, Brian
    Watson, Stanley J.
    Meng, Fan
    BMC BIOINFORMATICS, 2009, 10
  • [25] Open Biomedical Ontology-based Medline exploration
    Weijian Xuan
    Manhong Dai
    Barbara Mirel
    Jean Song
    Brian Athey
    Stanley J Watson
    Fan Meng
    BMC Bioinformatics, 10
  • [26] Ontology based automatic feature recognition framework
    Wang, Qingmai
    Yu, Xinghuo
    COMPUTERS IN INDUSTRY, 2014, 65 (07) : 1041 - 1052
  • [27] Knowledge-based and data-based analysis of biomedical signals
    Cohen, ME
    Hudson, DL
    COMPUTERS AND THEIR APPLICATIONS, 2003, : 67 - 70
  • [28] Automatic Domain Identification for Linked Open Data
    Lalithsena, Sarasi
    Hitzler, Pascal
    Sheth, Amit
    Jain, Prateek
    2013 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 1, 2013, : 205 - 212
  • [29] Cloud-based automatic test data generation framework
    Chawla, Priyanka
    Chana, Inderveer
    Rana, Ajay
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2016, 82 (05) : 712 - 738
  • [30] Data-Based Automatic Discretization of Nonparametric Distributions
    Toda, Alexis Akira
    COMPUTATIONAL ECONOMICS, 2021, 57 (04) : 1217 - 1235