Cross-Domain Data Integration for Named Entity Disambiguation in Biomedical Text

Authors
Varma, Maya [1]
Orr, Laurel [1]
Wu, Sen [1]
Leszczynski, Megan [1]
Ling, Xiao [2]
Re, Christopher [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Apple, Cupertino, CA USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Named entity disambiguation (NED), which involves mapping textual mentions to structured entities, is particularly challenging in the medical domain due to the presence of rare entities. Existing approaches are limited by the presence of coarse-grained structural resources in biomedical knowledge bases as well as the use of training datasets that provide low coverage over uncommon resources. In this work, we address these issues by proposing a cross-domain data integration method that transfers structural knowledge from a general text knowledge base to the medical domain. We utilize our integration scheme to augment structural resources and generate a large biomedical NED dataset for pretraining. Our pretrained model with injected structural knowledge achieves state-of-the-art performance on two benchmark medical NED datasets: MedMentions and BC5CDR. Furthermore, we improve disambiguation of rare entities by up to 57 accuracy points.
Pages: 4566-4575
Page count: 10
Related papers
  • [1] Data Augmentation for Cross-Domain Named Entity Recognition
    Chen, Shuguang
    Aguilar, Gustavo
    Neves, Leonardo
    Solorio, Thamar
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 5346 - 5356
  • [2] CrossNER: Evaluating Cross-Domain Named Entity Recognition
    Liu, Zihan
    Xu, Yan
    Yu, Tiezheng
    Dai, Wenliang
    Ji, Ziwei
    Cahyawijaya, Samuel
    Madotto, Andrea
    Fung, Pascale
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 13452 - 13460
  • [3] Dynamic Gazetteer Integration in Multilingual Models for Cross-Lingual and Cross-Domain Named Entity Recognition
    Fetahu, Besnik
    Fang, Anjie
    Rokhlenko, Oleg
    Malmasi, Shervin
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 2777 - 2790
  • [4] Zero-Resource Cross-Domain Named Entity Recognition
    Liu, Zihan
    Winata, Genta Indra
    Fung, Pascale
    5TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2020), 2020, : 1 - 6
  • [5] Transfer Joint Embedding for Cross-Domain Named Entity Recognition
    Pan, Sinno Jialin
    Toh, Zhiqiang
    Su, Jian
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2013, 31 (02)
  • [6] Cross-domain Named Entity Recognition via Graph Matching
    Zheng, Junhao
    Chen, Haibin
    Ma, Qianli
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 2670 - 2680
  • [7] Dual Contrastive Learning for Cross-Domain Named Entity Recognition
    Xu, Jingyun
    Yu, Junnan
    Cai, Yi
    Chua, Tat-Seng
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, 42 (06)
  • [8] Neural Adaptation Layers for Cross-domain Named Entity Recognition
    Lin, Bill Yuchen
    Lu, Wei
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 2012 - 2022
  • [9] IdentityRank: Named entity disambiguation in the news domain
    Fernandez, Norberto
    Arias Fisteus, Jesus
    Sanchez, Luis
    Lopez, Gonzalo
    EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (10) : 9207 - 9221
  • [10] Domain-Adapted Dependency Parsing for Cross-Domain Named Entity Recognition
    Dou, Chenxiao
    Sun, Xianghui
    Wang, Yaoshu
    Ji, Yunjie
    Ma, Baochang
    Li, Xiangang
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 12737 - 12744