BioKGrapher: Initial evaluation of automated knowledge graph construction from biomedical literature

Cited by: 0
Authors
Schaefer, Henning [1, 2]
Idrissi-Yaghir, Ahmad [2, 3]
Arzideh, Kamyar [4]
Damm, Hendrik [2, 3]
Pakull, Tabea M. G. [1, 2]
Schmidt, Cynthia S. [1, 4]
Bahn, Mikel [4]
Lodde, Georg [6]
Livingstone, Elisabeth [6]
Schadendorf, Dirk [6]
Nensa, Felix [4, 5]
Horn, Peter A. [1]
Friedrich, Christoph M. [2, 3]
Affiliations
[1] Univ Hosp Essen, Inst Transfus Med, Hufelandstr 55, D-45147 Essen, Germany
[2] Univ Appl Sci & Arts Dortmund FHDO, Dept Comp Sci, Emil Figge Str 42, D-44227 Dortmund, Germany
[3] Univ Hosp Essen, Inst Med Informat Biometry & Epidemiol IMIBE, Hufelandstr 55, D-45147 Essen, Germany
[4] Univ Hosp Essen, Inst Med IKIM, Girardetstr 2, D-45131 Essen, Germany
[5] Univ Hosp Essen, Inst Intervent & Diagnost Radiol & Neuroradiol, Hufelandstr 55, D-45147 Essen, Germany
[6] Univ Hosp Essen, Dept Dermatol, Hufelandstr 55, D-45147 Essen, Germany
Source
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL | 2024 / Vol. 24
Keywords
Knowledge graph; Named entity recognition; Entity linking; Clinical guidelines; Software; B-CELL LYMPHOMA; RITUXIMAB THERAPY; GASTROESOPHAGEAL JUNCTION; SCIENTIFIC LITERATURE; COMBINED NIVOLUMAB; CHEMOTHERAPY; HALLMARKS; CANCER; SYSTEM; TRIAL;
DOI
10.1016/j.csbj.2024.10.017
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The growth of biomedical literature presents challenges in extracting and structuring knowledge. Knowledge Graphs (KGs) offer a solution by representing relationships between biomedical entities. However, manual construction of KGs is labor-intensive and time-consuming, highlighting the need for automated methods. This work introduces BioKGrapher, a tool for automatic KG construction using large-scale publication data, with a focus on biomedical concepts related to specific medical conditions. BioKGrapher allows researchers to construct KGs from PubMed IDs.
Methods: The BioKGrapher pipeline begins with Named Entity Recognition and Linking (NER+NEL) to extract and normalize biomedical concepts from PubMed, mapping them to the Unified Medical Language System (UMLS). Extracted concepts are weighted and re-ranked using Kullback-Leibler divergence and local frequency balancing. These concepts are then integrated into hierarchical KGs, with relationships formed using terminologies like SNOMED CT and NCIt. Downstream applications include multi-label document classification using Adapter-infused Transformer models.
Results: BioKGrapher effectively aligns generated concepts with clinical practice guidelines from the German Guideline Program in Oncology (GGPO), achieving F1-Scores of up to 0.6. In multi-label classification, Adapter-infused models using a BioKGrapher cancer-specific KG improved micro F1-Scores by up to 0.89 percentage points over a non-specific KG and 2.16 points over base models across three BERT variants. The drug-disease extraction case study identified indications for Nivolumab and Rituximab.
Conclusion: BioKGrapher is a tool for automatic KG construction, aligning with the GGPO and enhancing downstream task performance. It offers a scalable solution for managing biomedical knowledge, with potential applications in literature recommendation, decision support, and drug repurposing.
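The abstract states that extracted concepts are weighted and re-ranked using Kullback-Leibler divergence, but it does not give the formula. The following is a minimal Python sketch of one common way to do this, not the authors' implementation: each concept is scored by its pointwise contribution p * log(p / q), where p is its smoothed relative frequency in a condition-specific corpus and q is its smoothed relative frequency in a background corpus. The function names `kl_concept_weights` and `rank_concepts`, the add-one smoothing, and the toy concept lists are all assumptions made for illustration; the "local frequency balancing" step is not modeled here.

```python
import math
from collections import Counter


def kl_concept_weights(foreground, background, smoothing=1.0):
    """Score concepts by their pointwise KL contribution p * log(p / q).

    foreground: concept mentions from the condition-specific corpus.
    background: concept mentions from a general reference corpus.
    smoothing:  additive smoothing so unseen concepts never give q == 0
                (an assumption of this sketch, not taken from the paper).
    """
    fg = Counter(foreground)
    bg = Counter(background)
    vocab = set(fg) | set(bg)

    fg_total = sum(fg.values()) + smoothing * len(vocab)
    bg_total = sum(bg.values()) + smoothing * len(vocab)

    weights = {}
    for concept in vocab:
        p = (fg[concept] + smoothing) / fg_total
        q = (bg[concept] + smoothing) / bg_total
        weights[concept] = p * math.log(p / q)
    return weights


def rank_concepts(weights):
    """Return concepts sorted from most to least condition-specific."""
    return sorted(weights, key=weights.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: concept mentions after NER+NEL normalization.
    melanoma_corpus = ["melanoma"] * 8 + ["nivolumab"] * 5 + ["patient"] * 10
    general_corpus = (
        ["patient"] * 50 + ["melanoma"] * 2 + ["nivolumab"] + ["diabetes"] * 20
    )
    scores = kl_concept_weights(melanoma_corpus, general_corpus)
    for concept in rank_concepts(scores):
        print(f"{concept}: {scores[concept]:.3f}")
```

Under this scoring, concepts that are frequent in the condition-specific corpus but rare in the background rise to the top, while generic concepts such as "patient" receive negative weights. The weights sum to the KL divergence between the two smoothed distributions.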
Pages: 639 - 660
Page count: 22
Related papers
50 records in total
  • [31] Biomedical Knowledge Graphs Construction From Conditional Statements
    Jiang, Tianwen
    Zeng, Qingkai
    Zhao, Tong
    Qin, Bing
    Liu, Ting
    Chawla, Nitesh V.
    Jiang, Meng
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2021, 18 (03) : 823 - 835
  • [32] Mining a stroke knowledge graph from literature
    Yang, Xi
    Wu, Chengkun
    Nenadic, Goran
    Wang, Wei
    Lu, Kai
    BMC BIOINFORMATICS, 2021, 22 (SUPPL 10)
  • [33] Mining a stroke knowledge graph from literature
    Xi Yang
    Chengkun Wu
    Goran Nenadic
    Wei Wang
    Kai Lu
    BMC Bioinformatics, 22
  • [34] SAKA: an intelligent platform for semi-automated knowledge graph construction and application
    Zhang, Hanrong
    Wang, Xinyue
    Pan, Jiabao
    Wang, Hongwei
    SERVICE ORIENTED COMPUTING AND APPLICATIONS, 2023, 17 (03) : 201 - 212
  • [35] The construction of knowledge from the scientific literature about the theme seaport performance evaluation
    Dutra, Ademar
    Mateo Ripoll-Feliu, Vicente
    Giner Fillol, Arturo
    Rolim Ensslin, Sandra
    Ensslin, Leonardo
    INTERNATIONAL JOURNAL OF PRODUCTIVITY AND PERFORMANCE MANAGEMENT, 2015, 64 (02) : 243 - 269
  • [36] SAKA: an intelligent platform for semi-automated knowledge graph construction and application
    Hanrong Zhang
    Xinyue Wang
    Jiabao Pan
    Hongwei Wang
    Service Oriented Computing and Applications, 2023, 17 : 201 - 212
  • [37] NetMe 2.0: a web-based platform for extracting and modeling knowledge from biomedical literature as a labeled graph
    Di Maria, Antonio
    Bellomo, Lorenzo
    Billeci, Fabrizio
    Cardillo, Alfio
    Alaimo, Salvatore
    Ferragina, Paolo
    Ferro, Alfredo
    Pulvirenti, Alfredo
    BIOINFORMATICS, 2024, 40 (05)
  • [38] A Framework To Build A Causal Knowledge Graph for Chronic Diseases and Cancers By Discovering Semantic Associations from Biomedical Literature
    Daowd, Ali
    Barrett, Michael
    Abidi, Samina
    Abidi, Syed Sibte Raza
    2021 IEEE 9TH INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2021), 2021 : 13 - 22
  • [39] Knowledge graph construction from multiple online encyclopedias
    Wu, Tianxing
    Wang, Haofen
    Li, Cheng
    Qi, Guilin
    Niu, Xing
    Wang, Meng
    Li, Lin
    Shi, Chaomin
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2020, 23 (05) : 2671 - 2698
  • [40] Knowledge graph construction from multiple online encyclopedias
    Tianxing Wu
    Haofen Wang
    Cheng Li
    Guilin Qi
    Xing Niu
    Meng Wang
    Lin Li
    Chaomin Shi
    World Wide Web, 2020, 23 : 2671 - 2698