Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition

Cited by: 20
Authors
Jia, Meihuizi [1 ,2 ]
Shen, Xin [3 ]
Shen, Lei [2 ]
Pang, Jinhui [1 ]
Liao, Lejian [1 ]
Song, Yang [2 ]
Chen, Meng [2 ]
He, Xiaodong [2 ]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] JD AI, Beijing, Peoples R China
[3] Australian Natl Univ, Canberra, ACT, Australia
Funding
National Key Research and Development Program of China
Keywords
multimodal named entity recognition; machine reading comprehension; visual grounding; transfer learning;
DOI
10.1145/3503161.3548427
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Multimodal named entity recognition (MNER) is a vision-language task where the system is required to detect entity spans and corresponding entity types given a sentence-image pair. Existing methods capture text-image relations with various attention mechanisms that only obtain implicit alignments between entity types and image regions. To locate regions more accurately and better model cross-/within-modal relations, we propose a machine reading comprehension based framework for MNER, namely MRC-MNER. By utilizing queries in MRC, our framework can provide prior information about entity types and image regions. Specifically, we design two stages, Query-Guided Visual Grounding and Multi-Level Modal Interaction, to align fine-grained type-region information and simulate text-image/inner-text interactions respectively. For the former, we train a visual grounding model via transfer learning to extract region candidates that can be further integrated into the second stage to enhance token representations. For the latter, we design text-image and inner-text interaction modules along with three sub-tasks for MRC-MNER. To verify the effectiveness of our model, we conduct extensive experiments on two public MNER datasets, Twitter2015 and Twitter2017. Experimental results show that MRC-MNER outperforms the current state-of-the-art models on Twitter2017, and yields competitive results on Twitter2015.
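The query-based formulation described in the abstract can be illustrated with a minimal, text-only sketch. It is not the authors' implementation: the query wording, the entity-type set (PER, LOC, ORG, MISC), and the helper names `build_mrc_inputs` and `decode_spans` are assumptions made for illustration, and the visual grounding stage and the three sub-tasks are omitted because the abstract does not specify them.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical natural-language queries, one per entity type.
# The exact wording and the type set are assumptions, not taken from the paper.
TYPE_QUERIES: Dict[str, str] = {
    "PER": "person: names of people and fictional characters",
    "LOC": "location: countries, cities, and geographic places",
    "ORG": "organization: companies, agencies, and institutions",
    "MISC": "miscellaneous: events, products, and other named entities",
}


def build_mrc_inputs(tokens: List[str]) -> Dict[str, List[str]]:
    """Pair each type query with the sentence as [CLS] query [SEP] sentence [SEP]."""
    return {
        etype: ["[CLS]", *query.split(), "[SEP]", *tokens, "[SEP]"]
        for etype, query in TYPE_QUERIES.items()
    }


def decode_spans(start_flags: Sequence[int], end_flags: Sequence[int]) -> List[Tuple[int, int]]:
    """Match every predicted start index to the nearest end index at or after it."""
    spans: List[Tuple[int, int]] = []
    for i, is_start in enumerate(start_flags):
        if not is_start:
            continue
        for j in range(i, len(end_flags)):
            if end_flags[j]:
                spans.append((i, j))
                break
    return spans


if __name__ == "__main__":
    sentence = ["Kevin", "Durant", "joins", "the", "Warriors"]
    inputs = build_mrc_inputs(sentence)
    print(inputs["PER"])
    # Start/end flags as a span-prediction head might output them for the PER query.
    print(decode_spans([1, 0, 0, 0, 0], [0, 1, 0, 0, 0]))
```

Because each entity type gets its own query, the type is known before the sentence is read, so the span head only has to mark start and end positions; this is the sense in which a query carries prior information about the entity type.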
Pages: 3549-3558
Number of pages: 10
Related papers
50 in total
  • [1] MNER-QG: An End-to-End MRC Framework for Multimodal Named Entity Recognition with Query Grounding
    Jia, Meihuizi
    Shen, Lei
    Shen, Xin
    Liao, Lejian
    Chen, Meng
    He, Xiaodong
    Chen, Zhendong
    Li, Jiaqi
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 7, 2023, : 8032 - 8040
  • [2] SpanMRC: Query with Entity Length for MRC-Based Named Entity Recognition
    Wu, Hao
    Li, Xianxian
    Liu, Peng
    Wang, Li-e
    Yang, Danping
    Zhou, Aoxiang
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT IV, ICIC 2024, 2024, 14878 : 281 - 293
  • [3] MPMRC-MNER: A Unified MRC framework for Multimodal Named Entity Recognition based Multimodal Prompt
    Bao, Xigang
    Tian, Mengyuan
    Zha, Zhiyuan
    Qin, Biao
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 47 - 56
  • [4] Judicial nested named entity recognition method with MRC framework
    Zhang H.
    Guo J.
    Wang Y.
    Zhang Z.
    Zhao H.
    International Journal of Cognitive Computing in Engineering, 2023, 4 : 118 - 126
  • [5] Named Entity Recognition in Query
    Guo, Jiafeng
    Xu, Gu
    Cheng, Xueqi
    Li, Hang
    PROCEEDINGS 32ND ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2009, : 267 - 274
  • [6] ANeTCM: A Novel MRC Framework for Traditional Chinese Medicine Named Entity Recognition
    Feng, Yuanyu
    Zhou, Yan
    IEEE ACCESS, 2024, 12 : 113235 - 113243
  • [7] A Survey on Multimodal Named Entity Recognition
    Qian, Shenyi
    Jin, Wenduo
    Chen, Yonggang
    Ma, Jiangtao
    Qiao, Yaqiong
    Lu, Jinyu
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT IV, 2023, 14089 : 609 - 622
  • [8] A Multi-expert Collaborative Framework for Multimodal Named Entity Recognition
    Xu, Bo
    Jiang, Haiqi
    Wei, Shouang
    Du, Ming
    Song, Hui
    Wang, Hongya
    MULTIMEDIA MODELING, MMM 2025, PT I, 2025, 15520 : 30 - 43
  • [9] MAF: A General Matching and Alignment Framework for Multimodal Named Entity Recognition
    Xu, Bo
    Huang, Shizhou
    Sha, Chaofeng
    Wang, Hongya
    WSDM'22: PROCEEDINGS OF THE FIFTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2022, : 1215 - 1223
  • [10] IMPROVING BIOMEDICAL NAMED ENTITY RECOGNITION WITH A UNIFIED MULTI-TASK MRC FRAMEWORK
    Tong, Yiqi
    Zhuang, Fuzhen
    Wang, Deqing
    Ying, Haochao
    Wang, Binling
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8332 - 8336