Survey on Data Integration Technologies for Relational Data and Knowledge Graph

Cited by: 0
Authors
Gao Y.-J. [1]
Ge C.-C. [2]
Guo Y.-X. [1]
Chen L. [1]
Affiliations
[1] College of Computer Science and Technology, Zhejiang University, Hangzhou
[2] Data Intelligence Innovation Lab, Huawei Cloud Computing Technologies Co. Ltd., Hangzhou
Source
Ruan Jian Xue Bao/Journal of Software | 2023, Vol. 34, No. 5
Keywords
data integration; knowledge graph (KG); relational data
DOI
10.13328/j.cnki.jos.006808
Abstract
In recent years, many countries and regions have come to regard big data as a critical strategic resource. However, obstacles to data circulation and insufficient data regulation are common in the big data era, leading to serious data silos, poor data quality, and difficulty in unlocking the potential of data elements. This has motivated researchers to explore data integration techniques that break down data barriers, enable data sharing, improve data quality, and unlock the potential of data elements. Relational data and knowledge graphs, two major forms of data organization and storage, are widely used in practice. This study therefore focuses on relational data and knowledge graphs, summarizing and analyzing the key technologies of data integration, including entity resolution, data fusion, and data cleaning. Finally, it outlines future research directions. © 2023 Chinese Academy of Sciences. All rights reserved.
Pages: 2365-2391
Page count: 26