Cross-Domain Data Integration for Named Entity Disambiguation in Biomedical Text

Cited by: 0
Authors
Varma, Maya [1]
Orr, Laurel [1]
Wu, Sen [1]
Leszczynski, Megan [1]
Ling, Xiao [2]
Re, Christopher [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Apple, Cupertino, CA USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Named entity disambiguation (NED), which involves mapping textual mentions to structured entities, is particularly challenging in the medical domain due to the presence of rare entities. Existing approaches are limited by the presence of coarse-grained structural resources in biomedical knowledge bases as well as the use of training datasets that provide low coverage over uncommon resources. In this work, we address these issues by proposing a cross-domain data integration method that transfers structural knowledge from a general text knowledge base to the medical domain. We utilize our integration scheme to augment structural resources and generate a large biomedical NED dataset for pretraining. Our pretrained model with injected structural knowledge achieves state-of-the-art performance on two benchmark medical NED datasets: MedMentions and BC5CDR. Furthermore, we improve disambiguation of rare entities by up to 57 accuracy points.
Pages: 4566-4575
Page count: 10