ENHANCING SEMANTIC WEB ENTITY MATCHING PROCESS USING TRANSFORMER NEURAL NETWORKS AND PRE-TRAINED LANGUAGE MODELS

Cited by: 0
Authors
Jabrane, Mourad [1 ]
Toulaoui, Abdelfattah [1 ]
Hafidi, Imad [1 ]
Affiliations
[1] Sultan Moulay Slimane Univ, Lab Proc Engn Comp Sci & Math, Bd Beni Amir, BP 77, Khouribga, Morocco
Keywords
Entity matching; record linkage; linked data; deep learning; transformer neural networks;
DOI
10.31577/cai_2024_6_1397
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Entity matching (EM) is a critical yet complex component of data cleaning and integration. Recent advances in EM have been driven predominantly by deep learning (DL) methods, which achieve high accuracy mainly on structured data conforming to a well-defined, high-quality schema. However, such schema-centric DL strategies struggle with the semantic web's linked data, which tends to be voluminous, semi-structured, heterogeneous, and often noisy. To address this, we introduce a novel, loosely schema-aware approach that leverages recent developments in DL, specifically transformer neural networks and pre-trained language models. We evaluated our approach on six datasets: two tabular and four RDF datasets from the semantic web. The findings demonstrate the effectiveness of our model in handling the complexities of noisy and varied data.
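To make the abstract's framing concrete, the following minimal Python sketch illustrates the general family of methods it describes: casting entity matching as sequence-pair classification with a pre-trained transformer (via the Hugging Face transformers library). This is not the authors' implementation; the model choice (roberta-base), the attribute-aware [COL]/[VAL] serialization, and the example records are illustrative assumptions.

    # Minimal sketch: entity matching as sequence-pair classification
    # with a pre-trained transformer. Illustrative only, not the paper's code.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    def serialize(record: dict) -> str:
        # Flatten a (possibly semi-structured) record into text while keeping
        # attribute names, so the model stays only loosely schema-aware.
        return " ".join(f"[COL] {k} [VAL] {v}" for k, v in record.items() if v)

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    # num_labels=2 attaches a fresh match/non-match head; it must be
    # fine-tuned on labeled pairs before the scores are meaningful.
    model = AutoModelForSequenceClassification.from_pretrained(
        "roberta-base", num_labels=2)
    model.eval()

    # Two descriptions of (possibly) the same entity, with differing schemas.
    left = {"name": "iPhone 13 Pro", "brand": "Apple", "storage": "256 GB"}
    right = {"title": "Apple iPhone 13 Pro 256GB", "color": "graphite"}

    # Encode the pair jointly so self-attention can align tokens across
    # both entity descriptions.
    inputs = tokenizer(serialize(left), serialize(right),
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    match_prob = torch.softmax(logits, dim=-1)[0, 1].item()
    print(f"P(match) = {match_prob:.3f}")

The joint encoding is the key design choice: unlike methods that embed each record separately, a cross-encoder lets the transformer compare attribute values token by token, which is what helps with noisy, heterogeneously structured linked data.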
Pages: 1397 - 1415
Page count: 19
Related Papers
50 records in total
  • [21] Efficient Aspect Object Models Using Pre-trained Convolutional Neural Networks
    Wilkinson, Eric
    Takahashi, Takeshi
    2015 IEEE-RAS 15TH INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS (HUMANOIDS), 2015, : 284 - 289
  • [22] Enhancing pre-trained language models with Chinese character morphological knowledge
    Zheng, Zhenzhong
    Wu, Xiaoming
    Liu, Xiangzhi
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (01)
  • [23] Enhancing radiology report generation through pre-trained language models
    Leonardi, Giorgio
    Portinale, Luigi
    Santomauro, Andrea
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2024
  • [24] Recent Progress on Named Entity Recognition Based on Pre-trained Language Models
    Yang, Binxia
    Luo, Xudong
    2023 IEEE 35TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2023, : 799 - 804
  • [25] A Simple but Effective Pluggable Entity Lookup Table for Pre-trained Language Models
    Ye, Deming
    Lin, Yankai
    Li, Peng
    Sun, Maosong
    Liu, Zhiyuan
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): (SHORT PAPERS), VOL 2, 2022, : 523 - 529
  • [26] Somun: entity-centric summarization incorporating pre-trained language models
    Inan, Emrah
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (10): 5301 - 5311
  • [28] A graph-based blocking approach for entity matching using pre-trained contextual embedding models
    Mugeni, John Bosco
    Amagasa, Toshiyuki
    37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 357 - 364
  • [29] A Survey of Controllable Text Generation Using Transformer-based Pre-trained Language Models
    Zhang, Hanqing
    Song, Haolin
    Li, Shaoyu
    Zhou, Ming
    Song, Dawei
    ACM COMPUTING SURVEYS, 2024, 56 (03)
  • [30] Incident detection and classification in renewable energy news using pre-trained language models on deep neural networks
    Wang, Qiqing
    Li, Cunbin
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2022, 22 (01) : 57 - 76