BioKGrapher: Initial evaluation of automated knowledge graph construction from biomedical literature

Authors
Schaefer, Henning [1, 2]
Idrissi-Yaghir, Ahmad [2, 3]
Arzideh, Kamyar [4]
Damm, Hendrik [2, 3]
Pakull, Tabea M. G. [1, 2]
Schmidt, Cynthia S. [1, 4]
Bahn, Mikel [4]
Lodde, Georg [6]
Livingstone, Elisabeth [6]
Schadendorf, Dirk [6]
Nensa, Felix [4, 5]
Horn, Peter A. [1]
Friedrich, Christoph M. [2, 3]
Affiliations
[1] Univ Hosp Essen, Inst Transfus Med, Hufelandstr 55, D-45147 Essen, Germany
[2] Univ Appl Sci & Arts Dortmund FHDO, Dept Comp Sci, Emil Figge Str 42, D-44227 Dortmund, Germany
[3] Univ Hosp Essen, Inst Med Informat Biometry & Epidemiol IMIBE, Hufelandstr 55, D-45147 Essen, Germany
[4] Univ Hosp Essen, Inst Med IKIM, Girardetstr 2, D-45131 Essen, Germany
[5] Univ Hosp Essen, Inst Intervent & Diagnost Radiol & Neuroradiol, Hufelandstr 55, D-45147 Essen, Germany
[6] Univ Hosp Essen, Dept Dermatol, Hufelandstr 55, D-45147 Essen, Germany
Source
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL | 2024 / Volume 24
Keywords
Knowledge graph; Named entity recognition; Entity linking; Clinical guidelines; Software; B-CELL LYMPHOMA; RITUXIMAB THERAPY; GASTROESOPHAGEAL JUNCTION; SCIENTIFIC LITERATURE; COMBINED NIVOLUMAB; CHEMOTHERAPY; HALLMARKS; CANCER; SYSTEM; TRIAL;
DOI
10.1016/j.csbj.2024.10.017
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Background: The growth of biomedical literature presents challenges in extracting and structuring knowledge. Knowledge Graphs (KGs) offer a solution by representing relationships between biomedical entities. However, manual construction of KGs is labor-intensive and time-consuming, highlighting the need for automated methods. This work introduces BioKGrapher, a tool for automatic KG construction using large-scale publication data, with a focus on biomedical concepts related to specific medical conditions. BioKGrapher allows researchers to construct KGs from PubMed IDs.
Methods: The BioKGrapher pipeline begins with Named Entity Recognition and Linking (NER+NEL) to extract and normalize biomedical concepts from PubMed, mapping them to the Unified Medical Language System (UMLS). Extracted concepts are weighted and re-ranked using Kullback-Leibler divergence and local frequency balancing. These concepts are then integrated into hierarchical KGs, with relationships formed using terminologies like SNOMED CT and NCIt. Downstream applications include multi-label document classification using Adapter-infused Transformer models.
Results: BioKGrapher effectively aligns generated concepts with clinical practice guidelines from the German Guideline Program in Oncology (GGPO), achieving F1-Scores of up to 0.6. In multi-label classification, Adapter-infused models using a BioKGrapher cancer-specific KG improved micro F1-Scores by up to 0.89 percentage points over a non-specific KG and 2.16 points over base models across three BERT variants. The drug-disease extraction case study identified indications for Nivolumab and Rituximab.
Conclusion: BioKGrapher is a tool for automatic KG construction, aligning with the GGPO and enhancing downstream task performance. It offers a scalable solution for managing biomedical knowledge, with potential applications in literature recommendation, decision support, and drug repurposing.
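The concept-weighting step named in the abstract (Kullback-Leibler divergence) can be illustrated with a minimal sketch. It is not BioKGrapher's implementation: the abstract does not give the exact formula, so the sketch assumes each concept is scored by its pointwise contribution p * log(p / q) to KL(P || Q), where P is the concept distribution in a condition-specific corpus and Q the distribution in a background corpus. The function names, the additive smoothing parameter alpha, and the toy concept strings are hypothetical, and the "local frequency balancing" step is not modeled.

```python
import math
from collections import Counter


def kl_concept_scores(foreground, background, alpha=1.0):
    """Score concepts by their pointwise contribution to KL(P || Q).

    foreground: concept mentions from the condition-specific corpus (P).
    background: concept mentions from a general corpus (Q).
    alpha: additive smoothing constant (an assumption, not from the paper).
    """
    fg = Counter(foreground)
    bg = Counter(background)
    vocab = set(fg) | set(bg)
    # Smoothed totals so that every concept has non-zero probability.
    fg_total = sum(fg.values()) + alpha * len(vocab)
    bg_total = sum(bg.values()) + alpha * len(vocab)
    scores = {}
    for concept in vocab:
        p = (fg[concept] + alpha) / fg_total
        q = (bg[concept] + alpha) / bg_total
        scores[concept] = p * math.log(p / q)
    return scores


def rank_concepts(foreground, background, alpha=1.0):
    """Return concepts sorted by descending score (ties broken by name)."""
    scores = kl_concept_scores(foreground, background, alpha)
    return sorted(scores, key=lambda c: (-scores[c], c))


# Toy data: plain strings stand in for linked UMLS concepts.
foreground = ["melanoma"] * 8 + ["nivolumab"] * 6 + ["patient"] * 6
background = (["patient"] * 50 + ["diabetes"] * 20
              + ["melanoma"] * 2 + ["nivolumab"] * 1)

ranked = rank_concepts(foreground, background)
scores = kl_concept_scores(foreground, background)
```

Under this scoring, concepts that are frequent in the condition-specific corpus but rare in the background rise to the top, while generic terms that dominate the background receive negative scores.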
Pages: 639 - 660
Page count: 22
Related papers
50 in total
  • [41] Correction to: Mining a stroke knowledge graph from literature
    Xi Yang
    Chengkun Wu
    Goran Nenadic
    Wei Wang
    Kai Lu
    BMC Bioinformatics, 22
  • [42] Automated construction of knowledge-bases from examples
    Tam, Kar Yan
    1600, INFORMS Inst.for Operations Res.and the Management Sciences (01):
  • [43] Research on information construction of knowledge graph based on literature retrieval in english learning
    Zhao, Guolong
    TRANSINFORMACAO, 2024, 36
  • [44] Automated Construction of Knowledge-Bases from Examples
    Tam, Kar Yan
    INFORMATION SYSTEMS RESEARCH, 1990, 1 (02) : 144 - 167
  • [45] ATOM: Construction of Anti-tumor Biomaterial Knowledge Graph by Biomedicine Literature
    Wang, Tingting
    Duan, Lei
    He, Chengxin
    Deng, Geng
    Qi, Ruiyi
    Zhang, Yidan
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 1256 - 1258
  • [46] Assisted Knowledge Graph Authoring: Human-Supervised Knowledge Graph Construction from Natural Language
    Gohsen, Marcel
    Stein, Benno
    PROCEEDINGS OF THE 2024 CONFERENCE ON HUMAN INFORMATION INTERACTION AND RETRIEVAL, CHIIR 2024, 2024, : 376 - 380
  • [47] FLUTE: Fast and reliable knowledge retrieval from biomedical literature
    Holtzapple, Emilee
    Telmer, Cheryl A.
    Miskov-Zivanov, Natasa
    DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2020,
  • [48] Extracting knowledge from genomic experiments by incorporating the biomedical literature
    Sluka, JP
    METHODS OF MICROARRAY DATA ANALYSIS II, 2002, : 195 - 209
  • [49] Automated extraction and semantic analysis of mutation impacts from the biomedical literature
    Nona Naderi
    René Witte
    BMC Genomics, 13
  • [50] Automated extraction and semantic analysis of mutation impacts from the biomedical literature
    Naderi, Nona
    Witte, Rene
    BMC GENOMICS, 2012, 13