Leveraging Multi-Token Entities in Document-Level Named Entity Recognition

Cited by: 0
Authors
Hu, Anwen [2, 3]
Dou, Zhicheng [1, 2]
Nie, Jian-Yun [5]
Wen, Ji-Rong [3, 4]
Affiliations
[1] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing, Peoples R China
[2] Renmin Univ China, Sch Informat, Beijing, Peoples R China
[3] Beijing Key Lab Big Data Management & Anal Method, Beijing, Peoples R China
[4] MOE, Key Lab Data Engn & Knowledge Engn, Beijing, Peoples R China
[5] Univ Montreal, Dept Comp Sci & Operat Res, Montreal, PQ, Canada
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most state-of-the-art named entity recognition (NER) systems process each sentence within a document independently. These systems tend to confuse entity types when the context within a single sentence is insufficient. To exploit context from the whole document, most document-level work leaves it to neural networks to learn the relations across sentences on their own, which is not intuitive to humans. In this paper, we divide entities into multi-token entities, which contain multiple tokens, and single-token entities, which consist of a single token. We propose that the context information of multi-token entities is more reliable in document-level NER for news articles. We design a fusion attention mechanism that not only learns the semantic relevance between occurrences of the same token, but also focuses more on occurrences belonging to multi-token entities. To identify multi-token entities, we design an auxiliary task, 'Multi-token Entity Classification', and perform it simultaneously with document-level NER. This auxiliary task is a simplification of NER and requires no extra annotation. Experimental results on the CoNLL-2003 dataset and the OntoNotes (nbm) dataset show that our model outperforms state-of-the-art sentence-level and document-level NER methods.
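The abstract notes that the auxiliary task is simplified from NER and needs no extra annotation. One way to read this is that its labels can be derived directly from existing NER tags. The Python sketch below illustrates that reading only: the function name `multi_token_labels`, the binary 0/1 label scheme, and the BIO-style input are assumptions made for illustration, and the abstract does not specify the paper's actual label set.

```python
from typing import List


def multi_token_labels(bio_tags: List[str]) -> List[int]:
    """Label each token 1 if it lies in an entity of two or more tokens, else 0.

    Assumption: tags follow a BIO-style scheme such as "B-ORG", "I-ORG", "O".
    An "I-" tag that does not continue a span of the same type starts a new span.
    """
    labels = [0] * len(bio_tags)
    start, etype = None, None  # start index and type of the open entity span
    # A trailing "O" sentinel closes any span that runs to the end of the input.
    for i, tag in enumerate(list(bio_tags) + ["O"]):
        prefix, _, kind = tag.partition("-")
        if prefix == "I" and start is not None and kind == etype:
            continue  # still inside the current entity
        # The current span (if any) ends just before position i.
        if start is not None and i - start >= 2:
            labels[start:i] = [1] * (i - start)
        if prefix in ("B", "I"):
            start, etype = i, kind
        else:
            start, etype = None, None
    return labels


# Hypothetical tag sequence: a two-token ORG, a single-token PER, a three-token LOC.
tags = ["B-ORG", "I-ORG", "O", "B-PER", "O", "B-LOC", "I-LOC", "I-LOC"]
print(multi_token_labels(tags))
```

Because the labels come entirely from the NER tags, a model could be trained on both tasks from the same annotated corpus, which matches the abstract's claim that no additional annotation is needed.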
Pages: 7961-7968
Page count: 8
Related papers
  • [1] Leveraging Document-Level Label Consistency for Named Entity Recognition
    Gui, Tao
    Ye, Jiacheng
    Zhang, Qi
    Zhou, Yaqian
    Gong, Yeyun
    Huang, Xuanjing
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3976 - 3982
  • [2] Document-Level Named Entity Recognition with Q-Network
    Lu, Tingming
    Gui, Yaocheng
    Gao, Zhiqiang
    PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III, 2019, 11672 : 164 - 178
  • [3] Span Graph Transformer for Document-Level Named Entity Recognition
    Mao, Hongli
    Mao, Xian-Ling
    Tang, Hanlin
    Shang, Yu-Ming
    Huang, Heyan
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 18769 - 18777
  • [4] Exploiting global contextual information for document-level named entity recognition
    Yu, Yiting
    Wang, Zanbo
    Wei, Wei
    Zhang, Ruihan
    Mao, Xian-Ling
    Feng, Shanshan
    Wang, Fei
    He, Zhiyong
    Jiang, Sheng
    KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [5] Domain-specific Named Entity Recognition with Document-Level Optimization
    Wang, Limin
    Li, Shoushan
    Yan, Qian
    Zhou, Guodong
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2018, 17 (04)
  • [6] Consistency enhancement of model prediction on document-level named entity recognition
    Jeong, Minbyul
    Kang, Jaewoo
    BIOINFORMATICS, 2023, 39 (06)
  • [7] Document-Level Named Entity Recognition by Incorporating Global and Neighbor Features
    Hu, Anwen
    Dou, Zhicheng
    Wen, Ji-rong
    INFORMATION RETRIEVAL (CCIR 2019), 2019, 11772 : 79 - 91
  • [8] DocBAN: An Efficient Biaffine Attention Network for Document-Level Named Entity Recognition
    Wu, Hao
    Li, Xianxian
    Yang, Danping
    Zhou, Aoxiang
    Wang, Peng
    Liu, Peng
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14877 : 65 - 76
  • [9] Context-aware multi-token concept recognition of biological entities
    Kwangmin Kim
    Doheon Lee
    BMC Bioinformatics, 22
  • [10] Context-aware multi-token concept recognition of biological entities
    Kim, Kwangmin
    Lee, Doheon
    BMC BIOINFORMATICS, 2021, 22 (SUPPL 11)