LargeEA: Aligning Entities for Large-scale Knowledge Graphs

Cited by: 12
Authors
Ge, Congcong [1]
Liu, Xiaoze [1]
Chen, Lu [1]
Gao, Yunjun [1]
Zheng, Baihua [2]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou, Peoples R China
[2] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
Source
Proceedings of the VLDB Endowment | 2021 / Vol. 15 / No. 2
Keywords
ALIGNMENT;
DOI
10.14778/3489496.3489504
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Entity alignment (EA) aims to find equivalent entities in different knowledge graphs (KGs). Current EA approaches suffer from scalability issues, limiting their usage in real-world EA scenarios. To tackle this challenge, we propose LargeEA to align entities between large-scale KGs. LargeEA consists of two channels, i.e., structure channel and name channel. For the structure channel, we present METIS-CPS, a memory-saving mini-batch generation strategy, to partition large KGs into smaller mini-batches. LargeEA, designed as a general tool, can adopt any existing EA approach to learn entities' structural features within each mini-batch independently. For the name channel, we first introduce NFF, a name feature fusion method, to capture rich name features of entities without involving any complex training process; we then exploit a name-based data augmentation to generate seed alignment without any human intervention. Such design fits common real-world scenarios much better, as seed alignment is not always available. Finally, LargeEA derives the EA results by fusing the structural features and name features of entities. Since no widely-acknowledged benchmark is available for large-scale EA evaluation, we also develop a large-scale EA benchmark called DBP1M extracted from real-world KGs. Extensive experiments confirm the superiority of LargeEA against state-of-the-art competitors.
Pages: 237-245
Number of pages: 9