BioKGrapher: Initial evaluation of automated knowledge graph construction from biomedical literature

Authors
Schaefer, Henning [1, 2]
Idrissi-Yaghir, Ahmad [2, 3]
Arzideh, Kamyar [4]
Damm, Hendrik [2, 3]
Pakull, Tabea M. G. [1, 2]
Schmidt, Cynthia S. [1, 4]
Bahn, Mikel [4]
Lodde, Georg [6]
Livingstone, Elisabeth [6]
Schadendorf, Dirk [6]
Nensa, Felix [4, 5]
Horn, Peter A. [1]
Friedrich, Christoph M. [2, 3]
Affiliations
[1] Univ Hosp Essen, Inst Transfus Med, Hufelandstr 55, D-45147 Essen, Germany
[2] Univ Appl Sci & Arts Dortmund FHDO, Dept Comp Sci, Emil Figge Str 42, D-44227 Dortmund, Germany
[3] Univ Hosp Essen, Inst Med Informat Biometry & Epidemiol IMIBE, Hufelandstr 55, D-45147 Essen, Germany
[4] Univ Hosp Essen, Inst Med IKIM, Girardetstr 2, D-45131 Essen, Germany
[5] Univ Hosp Essen, Inst Intervent & Diagnost Radiol & Neuroradiol, Hufelandstr 55, D-45147 Essen, Germany
[6] Univ Hosp Essen, Dept Dermatol, Hufelandstr 55, D-45147 Essen, Germany
Keywords
Knowledge graph; Named entity recognition; Entity linking; Clinical guidelines; Software; B-CELL LYMPHOMA; RITUXIMAB THERAPY; GASTROESOPHAGEAL JUNCTION; SCIENTIFIC LITERATURE; COMBINED NIVOLUMAB; CHEMOTHERAPY; HALLMARKS; CANCER; SYSTEM; TRIAL
DOI
10.1016/j.csbj.2024.10.017
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Background: The growth of biomedical literature presents challenges in extracting and structuring knowledge. Knowledge Graphs (KGs) offer a solution by representing relationships between biomedical entities. However, manual construction of KGs is labor-intensive and time-consuming, highlighting the need for automated methods. This work introduces BioKGrapher, a tool for automatic KG construction using large-scale publication data, with a focus on biomedical concepts related to specific medical conditions. BioKGrapher allows researchers to construct KGs from PubMed IDs.
Methods: The BioKGrapher pipeline begins with Named Entity Recognition and Linking (NER+NEL) to extract and normalize biomedical concepts from PubMed, mapping them to the Unified Medical Language System (UMLS). Extracted concepts are weighted and re-ranked using Kullback-Leibler divergence and local frequency balancing. These concepts are then integrated into hierarchical KGs, with relationships formed using terminologies like SNOMED CT and NCIt. Downstream applications include multi-label document classification using Adapter-infused Transformer models.
Results: BioKGrapher effectively aligns generated concepts with clinical practice guidelines from the German Guideline Program in Oncology (GGPO), achieving F1-Scores of up to 0.6. In multi-label classification, Adapter-infused models using a BioKGrapher cancer-specific KG improved micro F1-Scores by up to 0.89 percentage points over a non-specific KG and 2.16 points over base models across three BERT variants. The drug-disease extraction case study identified indications for Nivolumab and Rituximab.
Conclusion: BioKGrapher is a tool for automatic KG construction, aligning with the GGPO and enhancing downstream task performance. It offers a scalable solution for managing biomedical knowledge, with potential applications in literature recommendation, decision support, and drug repurposing.
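The abstract states that extracted concepts are weighted using Kullback-Leibler divergence but does not give the formula. As a rough illustration only, the sketch below scores each concept by its pointwise contribution to KL(P_fg || P_bg), where P_fg is the concept distribution in a condition-specific corpus and P_bg the distribution in a general background corpus. The function names, the add-one smoothing, and the toy concept lists are assumptions made for this example and are not taken from BioKGrapher; the local frequency balancing step is not modeled.

```python
import math
from collections import Counter


def kl_concept_scores(foreground, background, smoothing=1.0):
    """Score concepts by their pointwise contribution to KL(P_fg || P_bg).

    Hypothetical helper: `foreground` and `background` are lists of concept
    identifiers (e.g. linked UMLS concepts), one entry per mention.
    Additive smoothing (an assumption) avoids division by zero for concepts
    missing from the background corpus.
    """
    fg = Counter(foreground)
    bg = Counter(background)
    vocab = set(fg) | set(bg)
    fg_total = sum(fg.values()) + smoothing * len(vocab)
    bg_total = sum(bg.values()) + smoothing * len(vocab)

    scores = {}
    for concept in fg:
        p = (fg[concept] + smoothing) / fg_total  # smoothed foreground probability
        q = (bg[concept] + smoothing) / bg_total  # smoothed background probability
        scores[concept] = p * math.log(p / q)
    return scores


def rank_concepts(foreground, background, smoothing=1.0):
    """Return foreground concepts ordered from most to least condition-specific."""
    scores = kl_concept_scores(foreground, background, smoothing)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: "melanoma" is frequent in the condition corpus but rare in the
# background, while "patient" is common everywhere.
condition_mentions = ["melanoma"] * 5 + ["patient"] * 5
background_mentions = ["patient"] * 50 + ["melanoma"] + ["diabetes"] * 10
ranking = rank_concepts(condition_mentions, background_mentions)
```

Under this scoring, a concept that is over-represented in the condition corpus relative to the background gets a positive score, while a generic term that is relatively more common in the background gets a negative one, so generic terms sink in the ranking.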
Pages: 639-660
Number of pages: 22