QA-Matcher: Unsupervised Entity Matching Using a Question Answering Model

Authors
Hayashi, Shogo [1, 3]
Dong, Yuyang [2 ]
Oyamada, Masafumi [2 ]
Affiliations
[1] BizReach Inc, Tokyo, Japan
[2] NEC Corp Ltd, Tokyo, Japan
[3] NEC Corp Ltd, Tokyo, Japan
Keywords
entity matching; question answering
DOI
10.1007/978-3-031-33383-5_14
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Entity matching (EM) is a fundamental task in data integration: identifying records that refer to the same real-world entity. Unsupervised EM is often preferred in real-world applications because labeling data is labor-intensive. However, existing unsupervised methods may not perform well across domains, because the assumptions they rely on may not hold for every task. In this paper, we propose QA-Matcher, an unsupervised EM model that is domain-agnostic and does not require any particular assumptions. Our idea is to frame EM as question answering (QA) by utilizing a trained QA model. Specifically, we generate a question that asks which record has the characteristics of a particular record, and a passage that describes the other records. We then use the trained QA model to answer the question, and predict the record pair formed by the question and its answer as a match. QA-Matcher leverages the power of a QA model to represent the semantics of various types of entities, allowing it to identify identical entities in a QA-like fashion. In extensive experiments on 16 real-world datasets, we demonstrate that QA-Matcher outperforms unsupervised EM methods and is competitive with supervised methods.
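The QA framing described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the "attribute is value" serialization, the question template, the "Record i:" passage layout, and above all `toy_qa_model`, a token-overlap stand-in used in place of a trained QA model so that the example runs without any downloads. Handling of queries that have no matching record is omitted.

```python
import re
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def serialize(record: Record) -> str:
    """Flatten a record into 'attribute is value' text (hypothetical format)."""
    return ", ".join(f"{key} is {value}" for key, value in record.items())


def build_question(query: Record) -> str:
    """Ask which record has the characteristics of the query record."""
    return f"Which record has the following characteristics: {serialize(query)}?"


def build_passage(candidates: List[Record]) -> Tuple[str, List[Tuple[int, int]]]:
    """Describe all candidate records in one passage.

    Also returns the (start, end) character span of each candidate's sentence.
    """
    parts, spans, pos = [], [], 0
    for i, record in enumerate(candidates):
        sentence = f"Record {i}: {serialize(record)}."
        spans.append((pos, pos + len(sentence)))
        parts.append(sentence)
        pos += len(sentence) + 1  # +1 for the joining space
    return " ".join(parts), spans


def tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


def toy_qa_model(question: str, passage: str) -> int:
    """Stand-in for a trained extractive QA model (assumption, not the paper's model).

    Returns the character offset where the answer starts, chosen as the
    'Record i:' segment sharing the most tokens with the question.
    """
    starts = [m.start() for m in re.finditer(r"Record \d+:", passage)]
    ends = starts[1:] + [len(passage)]
    question_tokens = tokens(question)
    best = max(
        zip(starts, ends),
        key=lambda span: len(question_tokens & tokens(passage[span[0]:span[1]])),
    )
    return best[0]


def match(
    query: Record,
    candidates: List[Record],
    qa_model: Callable[[str, str], int] = toy_qa_model,
) -> int:
    """Return the index of the candidate whose span contains the predicted answer."""
    question = build_question(query)
    passage, spans = build_passage(candidates)
    answer_start = qa_model(question, passage)
    for i, (start, end) in enumerate(spans):
        if start <= answer_start < end:
            return i
    raise ValueError("answer offset falls outside every record span")


query = {"title": "iPhone 13 Pro 128GB", "brand": "Apple"}
candidates = [
    {"title": "Galaxy S21 Ultra", "brand": "Samsung"},
    {"title": "Apple iPhone 13 Pro (128 GB)", "brand": "Apple"},
    {"title": "Pixel 6", "brand": "Google"},
]
print(match(query, candidates))
```

Because `match` accepts any callable that maps a question and a passage to an answer offset, the toy scorer could in principle be replaced by a wrapper around a trained extractive QA model; the span lookup that turns the answer position back into a record pair would stay the same.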
Pages: 174-185
Page count: 12