Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition

Cited by: 20
Authors
Jia, Meihuizi [1, 2]
Shen, Xin [3]
Shen, Lei [2]
Pang, Jinhui [1]
Liao, Lejian [1]
Song, Yang [2]
Chen, Meng [2]
He, Xiaodong [2]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] JD AI, Beijing, Peoples R China
[3] Australian Natl Univ, Canberra, ACT, Australia
Funding
National Key Research and Development Program of China
Keywords
multimodal named entity recognition; machine reading comprehension; visual grounding; transfer learning
DOI
10.1145/3503161.3548427
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Multimodal named entity recognition (MNER) is a vision-language task where the system is required to detect entity spans and corresponding entity types given a sentence-image pair. Existing methods capture text-image relations with various attention mechanisms that only obtain implicit alignments between entity types and image regions. To locate regions more accurately and better model cross-/within-modal relations, we propose a machine reading comprehension based framework for MNER, namely MRC-MNER. By utilizing queries in MRC, our framework can provide prior information about entity types and image regions. Specifically, we design two stages, Query-Guided Visual Grounding and Multi-Level Modal Interaction, to align fine-grained type-region information and simulate text-image/inner-text interactions respectively. For the former, we train a visual grounding model via transfer learning to extract region candidates that can be further integrated into the second stage to enhance token representations. For the latter, we design text-image and inner-text interaction modules along with three sub-tasks for MRC-MNER. To verify the effectiveness of our model, we conduct extensive experiments on two public MNER datasets, Twitter2015 and Twitter2017. Experimental results show that MRC-MNER outperforms the current state-of-the-art models on Twitter2017, and yields competitive results on Twitter2015.
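The abstract frames MNER as machine reading comprehension, where a query per entity type supplies prior information. The following is a minimal text-only sketch of that general MRC-style framing, not the authors' implementation: the query wording in `TYPE_QUERIES`, the helper names `build_mrc_inputs` and `decode_spans`, and the nearest-end matching rule are all illustrative assumptions, and the visual grounding and modal interaction stages are omitted entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical natural-language queries, one per entity type.
# The wording is an assumption for illustration, not taken from the paper.
TYPE_QUERIES: Dict[str, str] = {
    "PER": "Person: names of people and fictional characters.",
    "LOC": "Location: countries, cities, and other places.",
    "ORG": "Organization: companies, teams, and institutions.",
    "MISC": "Miscellaneous: events, products, and other entities.",
}


def build_mrc_inputs(tokens: List[str]) -> List[Tuple[str, str, List[str]]]:
    """Pair the same sentence with one query per entity type."""
    return [(etype, query, tokens) for etype, query in TYPE_QUERIES.items()]


def decode_spans(start_flags: List[int], end_flags: List[int]) -> List[Tuple[int, int]]:
    """Turn per-token start/end predictions into inclusive (start, end) spans.

    Assumed rule: each predicted start is matched to the nearest predicted
    end at or after it; starts with no following end are dropped.
    """
    spans: List[Tuple[int, int]] = []
    for i, is_start in enumerate(start_flags):
        if not is_start:
            continue
        for j in range(i, len(end_flags)):
            if end_flags[j]:
                spans.append((i, j))
                break
    return spans


if __name__ == "__main__":
    tokens = ["Kevin", "Durant", "joins", "the", "Warriors"]
    for etype, query, sentence in build_mrc_inputs(tokens):
        print(etype, "|", query, "|", " ".join(sentence))
    # Stand-in predictions for the PER query; a real model would produce these.
    print(decode_spans([1, 0, 0, 0, 0], [0, 1, 0, 0, 0]))
```

In this framing the entity type is fixed by the query, so the model only has to locate spans in the sentence; a multimodal system along the lines described in the abstract would additionally condition on image regions retrieved for each query.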
Pages: 3549-3558
Page count: 10
Related papers
50 in total
  • [21] A Token-wise Graph-based Framework for Multimodal Named Entity Recognition
    Zhang, Zhengxuan
    Mai, Weixing
    Xiong, Haoliang
    Wu, Chuhan
    Xue, Yun
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2153 - 2158
  • [22] Grounded Multimodal Named Entity Recognition on Social Media
    Yu, Jianfei
    Li, Ziyan
    Wang, Jieming
    Xia, Rui
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 9141 - 9154
  • [23] GNN-Based Multimodal Named Entity Recognition
    Gong, Yunchao
    Lv, Xueqiang
    Yuan, Zhu
    You, Xindong
    Hu, Feng
    Chen, Yuzhong
    COMPUTER JOURNAL, 2024, 67 (08): : 2622 - 2632
  • [24] Named Entity Recognition Datasets: A Classification Framework
    Zhang, Ying
    Xiao, Gang
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2024, 17 (01)
  • [25] Rembrandt - a named-entity recognition framework
    Cardoso, Nuno
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 1240 - 1243
  • [26] Named Entity Recognition Datasets: A Classification Framework
    Ying Zhang
    Gang Xiao
    International Journal of Computational Intelligence Systems, 17
  • [27] A framework for Named Entity Recognition in the Open domain
    Evans, RJ
    RECENT ADVANCES IN NATURAL LANGUAGE PROCESSING III, 2004, 260 : 267 - 276
  • [28] Using Search Session Context for Named Entity Recognition in Query
    Du, Junwu
    Zhang, Zhimin
    Yan, Jun
    Cui, Yan
    Chen, Zheng
    SIGIR 2010: PROCEEDINGS OF THE 33RD ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH DEVELOPMENT IN INFORMATION RETRIEVAL, 2010, : 765 - 766
  • [29] Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer
    Yu, Jianfei
    Jiang, Jing
    Yang, Li
    Xia, Rui
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3342 - 3352
  • [30] Enhancing Cross-Lingual Named Entity Recognition via Dual Contrastive Learning Based on MRC Framework
    Zhuo, Aiqing
    Shi, Kunli
    Gu, Jinghang
    Qian, Longhua
    Zhoul, Guodong
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT II, NLPCC 2024, 2025, 15360 : 122 - 134