Integrating deep learning architectures for enhanced biomedical relation extraction: a pipeline approach

Authors
Sarol, M. Janina [1]
Hong, Gibong [2]
Guerra, Evan [2]
Kilicoglu, Halil [2]
Affiliations
[1] Univ Illinois, Informat Programs, 614 E Daniel St, Champaign, IL 61820 USA
[2] Univ Illinois, Sch Informat Sci, 501 Daniel St, Champaign, IL 61820 USA
Source
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION | 2024 / Vol. 2024
Funding
US National Institutes of Health;
Keywords
NORMALIZATION; RECOGNITION; RESOURCE; CORPUS; ENTITY;
DOI
10.1093/database/baae079
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Biomedical relation extraction from scientific publications is a key task in biomedical natural language processing (NLP) and can facilitate the creation of large knowledge bases, enable more efficient knowledge discovery, and accelerate evidence synthesis. In this paper, building upon our previous effort in the BioCreative VIII BioRED Track, we propose an enhanced end-to-end pipeline approach for biomedical relation extraction (RE) and novelty detection (ND) that effectively leverages existing datasets and integrates state-of-the-art deep learning methods. Our pipeline consists of four tasks performed sequentially: named entity recognition (NER), entity linking (EL), RE, and ND. We trained models using the BioRED benchmark corpus that was the basis of the shared task. We explored several methods for each task and combinations thereof. For NER, we compared a BERT-based sequence labeling model that uses the BIO scheme with a span classification model. For EL, we trained a convolutional neural network model for diseases and chemicals and used an existing tool, PubTator 3.0, for mapping other entity types. For RE and ND, we adapted the BERT-based, sentence-bound PURE model to bidirectional and document-level extraction. We also performed extensive hyperparameter tuning to improve model performance. We obtained our best performance using BERT-based models for NER, RE, and ND, and the hybrid approach for EL. Our enhanced and optimized pipeline showed substantial improvement compared to our shared task submission: NER: 93.53 (+3.09), EL: 83.87 (+9.73), RE: 46.18 (+15.67), and ND: 38.86 (+14.9). While the performances of the NER and EL models are reasonably high, RE and ND tasks remain challenging at the document level. Further enhancements to the dataset could enable more accurate and useful models for practical use. We provide our models and code at https://github.com/janinaj/e2eBioMedRE/. Database URL: https://github.com/janinaj/e2eBioMedRE/
Pages: 12