VersaMatch: Ontology Matching with Weak Supervision

Cited by: 2
Authors
Furst, Jonathan [1]
Argerich, Mauricio Fadel [2]
Cheng, Bin [3]
Affiliations
[1] Zurich Univ Appl Sci, NEC Labs Europe, Zurich, Switzerland
[2] Univ Politecn Madrid, NEC Labs Europe, Madrid, Spain
[3] Springer Nat, NEC Labs Europe, Berlin, Germany
Source
Proceedings of the VLDB Endowment | 2023 / Vol. 16 / Issue 06
DOI
10.14778/3583140.3583148
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Ontology matching is crucial to data integration for cross-silo data sharing and has mainly been addressed with heuristic and machine learning (ML) methods. While heuristic methods are often inflexible and hard to extend to new domains, ML methods rely on substantial amounts of labeled training data that are hard to obtain. To overcome these limitations, we propose VersaMatch, a flexible, weakly supervised ontology matching system. VersaMatch employs various weak supervision sources, such as heuristic rules, pattern matching, and external knowledge bases, to produce labels from a large amount of unlabeled data for training a discriminative ML model. For prediction, VersaMatch uses a novel ensemble model that combines the weak supervision sources with the discriminative model to support generalization while retaining high precision. Our ensemble method boosts end-model performance by 4 points compared to a traditional weak-supervision baseline. In addition, compared to state-of-the-art ontology matchers, VersaMatch achieves an overall 4-point improvement in F1 score across 26 ontology combinations from different domains. For recently released, in-the-wild datasets, VersaMatch beats the next-best matchers by 9 points in F1. Furthermore, its core weak-supervision logic can easily be improved by adding more knowledge sources and collecting more unlabeled data for training.
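
The weak-supervision idea in the abstract can be illustrated with a small sketch. It is not VersaMatch's implementation: the labeling functions, the thresholds, the `SYNONYMS` table (standing in for an external knowledge base), the majority vote (used here in place of a learned label model), and the fallback rule in `ensemble_predict` are all assumptions made for illustration. The `model_score` argument is a placeholder for the output of a trained discriminative model.

```python
from difflib import SequenceMatcher

ABSTAIN, NO_MATCH, MATCH = -1, 0, 1

# Hypothetical synonym table standing in for an external knowledge base.
SYNONYMS = {frozenset({"temperature", "temp"}), frozenset({"author", "writer"})}


def lf_exact_name(a, b):
    """Heuristic rule: identical names (case-insensitive) are a match."""
    return MATCH if a.lower() == b.lower() else ABSTAIN


def lf_string_similarity(a, b):
    """String rule: very similar names match, very different ones do not."""
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    if ratio >= 0.85:  # assumed threshold
        return MATCH
    if ratio <= 0.30:  # assumed threshold
        return NO_MATCH
    return ABSTAIN


def lf_synonym(a, b):
    """Knowledge-base rule: names listed as synonyms are a match."""
    return MATCH if frozenset({a.lower(), b.lower()}) in SYNONYMS else ABSTAIN


LABELING_FUNCTIONS = [lf_exact_name, lf_string_similarity, lf_synonym]


def weak_label(a, b):
    """Majority vote over non-abstaining sources; ABSTAIN on no votes or a tie."""
    votes = [v for v in (lf(a, b) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    pos = votes.count(MATCH)
    neg = len(votes) - pos
    if pos == neg:
        return ABSTAIN
    return MATCH if pos > neg else NO_MATCH


def ensemble_predict(a, b, model_score, threshold=0.5):
    """Trust the weak sources when they decide; otherwise use the model score."""
    label = weak_label(a, b)
    if label != ABSTAIN:
        return label
    return MATCH if model_score >= threshold else NO_MATCH
```

In this sketch, pairs that the weak sources can label (for example `"temp"` and `"temperature"` through the synonym table) could serve as training data, while pairs on which every source abstains are decided by the model score at prediction time.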
Pages: 1305-1318
Page count: 14