DocR-BERT: Document-Level R-BERT for Chemical-Induced Disease Relation Extraction via Gaussian Probability Distribution

Cited by: 9
Authors
Li, Zhengguang [1]
Chen, Heng [1]
Qi, Ruihua [1]
Lin, Hongfei [2]
Chen, Huayue [3]
Affiliations
[1] Dalian Univ Foreign Languages, Res Ctr Language Intelligence, Dalian 116044, Liaoning, Peoples R China
[2] Dalian Univ Technol, Coll Comp Sci & Technol, Dalian 116023, Peoples R China
[3] China West Normal Univ, Sch Comp Sci, Nanchong 637002, Peoples R China
Keywords
Semantics; Feature extraction; Diseases; Data mining; Task analysis; Biological system modeling; Chemicals; Chemical-induced diseases; document-level; mutual semantic information; BERT; co-occurrence sentence;
DOI
10.1109/JBHI.2021.3116769
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Chemical-induced disease (CID) relation extraction from biomedical articles plays an important role in disease treatment and drug development. Existing methods fail to capture complete document-level semantic information because they ignore the semantic information of entities in different sentences. In this work, we propose an effective document-level relation extraction model that automatically extracts intra- and inter-sentential CID relations from articles. First, the model employs BERT to generate contextual semantic representations of the title, the abstract, and the shortest dependency paths (SDPs). Second, to enhance the semantic representation of the whole document, cross-attention combined with self-attention (named cross2self-attention) between the abstract, title, and SDPs is proposed to learn mutual semantic information. Third, to distinguish the importance of the target entity in different sentences, a Gaussian probability distribution is used to compute the weights of the co-occurrence sentence and its adjacent entity sentences. Our document-level R-BERT (DocR-BERT) then collects more complete semantic information about the target entity from all entities occurring in the document. Finally, the related representations are concatenated and fed into a softmax function to extract CIDs. We evaluated the model on the CDR corpus provided by BioCreative V. Without external resources, the proposed model outperforms other state-of-the-art models, achieving F1-scores of 53.5%, 70%, and 63.7% on inter-sentential, intra-sentential, and overall relations in the CDR dataset, respectively. The experimental results indicate that cross2self-attention, the Gaussian probability distribution, and DocR-BERT effectively improve CID extraction performance. Furthermore, the mutual semantic information learned by cross2self-attention from the abstract towards the title can significantly influence the performance of document-level biomedical relation extraction tasks.
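The Gaussian weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption: the weight of sentence i is taken to be proportional to exp(-(i - c)^2 / (2 * sigma^2)), where c is the index of the co-occurrence sentence and sigma is a hypothetical width parameter. The function names are illustrative and are not taken from the paper.

```python
import math


def gaussian_sentence_weights(num_sentences, center, sigma=1.0):
    """Return normalized Gaussian weights over sentence positions.

    Assumption: the co-occurrence sentence sits at index `center`, and
    sentences further away from it receive smaller weights.
    """
    if not 0 <= center < num_sentences:
        raise ValueError("center must index a sentence in the document")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    raw = [
        math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
        for i in range(num_sentences)
    ]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_entity_representation(sentence_vectors, weights):
    """Combine per-sentence entity vectors into one weighted vector."""
    dim = len(sentence_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, sentence_vectors))
        for d in range(dim)
    ]


# Toy example: five sentences, co-occurrence sentence at index 2.
weights = gaussian_sentence_weights(num_sentences=5, center=2, sigma=1.0)
# Hypothetical 2-dimensional entity vectors, one per sentence.
vectors = [[float(i), 1.0] for i in range(5)]
pooled = weighted_entity_representation(vectors, weights)
```

In this sketch the co-occurrence sentence receives the largest weight and its neighbours decay symmetrically. In the model described above, the vectors would presumably come from BERT rather than from toy values.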
Pages: 1341-1352
Number of pages: 12