Representation and Labeling Gap Bridging for Cross-lingual Named Entity Recognition

Cited by: 2
Authors
Zhang, Xinghua [1, 2]
Yu, Bowen [3]
Cao, Jiangxia [1, 2]
Li, Quangang [1]
Wang, Xuebin [1]
Liu, Tingwen [1]
Xu, Hongbo [1]
Affiliations
[1] Chinese Academy of Sciences, Institute of Information Engineering, Beijing, People's Republic of China
[2] University of Chinese Academy of Sciences, School of Cyber Security, Beijing, People's Republic of China
[3] Alibaba Group, DAMO Academy, Hangzhou, People's Republic of China
Keywords
Low Resource; Cross-lingual Transfer; Knowledge Acquisition;
DOI
10.1145/3539618.3591757
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Cross-lingual Named Entity Recognition (NER) aims to address the challenge of data scarcity in low-resource languages by leveraging knowledge from high-resource languages. Most current work relies on general multilingual language models to represent text, and then uses classic combined tagging (e.g., B-ORG) to annotate entities. However, this approach neglects the lack of cross-lingual alignment of entity representations in language models, and also ignores the fact that entity spans and types have varying levels of labeling difficulty in terms of transferability. To address these challenges, we propose a novel framework, referred to as DLBri, which addresses the issues of representation and labeling simultaneously. Specifically, the proposed framework utilizes progressive contrastive learning with source-to-target oriented sentence pairs to pre-finetune the language model, resulting in improved cross-lingual entity-aware representations. Additionally, a decomposition-then-combination procedure is proposed, which separately transfers entity span and type, and then combines their information, to reduce the difficulty of cross-lingual entity labeling. Extensive experiments on 13 diverse language pairs confirm the effectiveness of DLBri. The code for this framework is available at https://github.com/AIRobotZhang/DLBri.
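The contrast between combined tagging (e.g., B-ORG) and separate span and type labels can be illustrated at the label level. The following is only a minimal sketch, not the DLBri implementation: the abstract does not specify how the framework transfers or combines span and type information, and the function names `decompose` and `combine` are hypothetical, introduced here for illustration.

```python
from typing import List, Tuple


def decompose(tags: List[str]) -> Tuple[List[str], List[str]]:
    """Split combined BIO tags (e.g. 'B-ORG') into span tags and type tags.

    Hypothetical helper: illustrates label decomposition only.
    """
    span_tags, type_tags = [], []
    for tag in tags:
        if tag == "O":
            # Non-entity tokens carry no span position and no type.
            span_tags.append("O")
            type_tags.append("O")
        else:
            position, entity_type = tag.split("-", 1)
            span_tags.append(position)      # 'B' or 'I'
            type_tags.append(entity_type)   # e.g. 'ORG', 'PER'
    return span_tags, type_tags


def combine(span_tags: List[str], type_tags: List[str]) -> List[str]:
    """Merge span tags and type tags back into combined BIO tags.

    A token is labeled as an entity only if both sequences agree
    that it belongs to one (an assumption made for this sketch).
    """
    combined = []
    for position, entity_type in zip(span_tags, type_tags):
        if position == "O" or entity_type == "O":
            combined.append("O")
        else:
            combined.append(f"{position}-{entity_type}")
    return combined


tags = ["B-ORG", "I-ORG", "O", "B-PER"]
span_tags, type_tags = decompose(tags)
print(span_tags)   # span-level labels
print(type_tags)   # type-level labels
print(combine(span_tags, type_tags))
```

In this sketch the two label sequences could be predicted by separate models and merged afterwards; how DLBri actually combines the two sources of information is described in the paper and the linked repository, not here.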
Pages: 1230 - 1240
Number of pages: 11
Related papers
50 in total
  • [31] An Unsupervised Multiple-Task and Multiple-Teacher Model for Cross-lingual Named Entity Recognition
    Li, Zhuoran
    Hu, Chunming
    Guo, Xiaohui
    Chen, Junfan
    Qin, Wenyi
    Zhang, Richong
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 170 - 179
  • [32] Enhancing Cross-Lingual Named Entity Recognition via Dual Contrastive Learning Based on MRC Framework
    Zhuo, Aiqing
    Shi, Kunli
    Gu, Jinghang
    Qian, Longhua
    Zhou, Guodong
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT II, NLPCC 2024, 2025, 15360 : 122 - 134
  • [33] Cross-Lingual Cross-Domain Nested Named Entity Evaluation on English Web Texts
    Plank, Barbara
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1808 - 1815
  • [34] Unsupervised Cross-lingual Representation Learning for Speech Recognition
    Conneau, Alexis
    Baevski, Alexei
    Collobert, Ronan
    Mohamed, Abdelrahman
    Auli, Michael
    INTERSPEECH 2021, 2021, : 2426 - 2430
  • [35] PRAM: An End-to-end Prototype-based Representation Alignment Model for Zero-resource Cross-lingual Named Entity Recognition
    Huang, Yucheng
    Liu, Wenqiang
    Zhang, Xianli
    Lang, Jun
    Gong, Tieliang
    Li, Chen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3220 - 3233
  • [36] Discrepancy and Uncertainty Aware Denoising Knowledge Distillation for Zero-Shot Cross-Lingual Named Entity Recognition
    Ge, Ling
    Hu, Chunming
    Ma, Guanghui
    Liu, Jihong
    Zhang, Hong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 18056 - 18064
  • [37] Zero-Shot Cross-Lingual Named Entity Recognition via Progressive Multi-Teacher Distillation
    Li, Zhuoran
    Hu, Chunming
    Zhang, Richong
    Chen, Junfan
    Guo, Xiaohui
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 4617 - 4630
  • [38] ProKD: An Unsupervised Prototypical Knowledge Distillation Network for Zero-Resource Cross-Lingual Named Entity Recognition
    Ge, Ling
    Hu, Chunming
    Ma, Guanghui
    Zhang, Hong
    Liu, Jihong
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 12818 - 12826
  • [39] Independent Relation Representation With Line Graph for Cross-Lingual Entity Alignment
    Zhang, Yuhong
    Wu, Jianqing
    Yu, Kui
    Wu, Xindong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (11) : 11503 - 11514
  • [40] Neural Cross-Lingual Entity Linking
    Sil, Avirup
    Kundu, Gourab
    Florian, Radu
    Hamza, Wael
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5464 - 5472